Simon,

Are you using lightweight (LW) edges or heavyweight (HW) edges for your application?

OrientDB represents LW edges as properties of the nodes they link, whereas HW edges are represented as separate objects: the nodes link to the edges, and the edges maintain the links to the nodes at either end.

An example to explain: suppose that you have two classes (foo, cluster id #21, and bar, cluster id #22) and you want to create an edge between them (e.g. has).

For a LW edge such as foo(#21:1)--(has)->bar(#22:1):

  foo(#21:1) has a list property, out_has, with one value: [#22:1]
  bar(#22:1) has a list property, in_has, with one value: [#21:1]

For a HW edge, you create a class (has, cluster id #23), and the property values change:

  foo(#21:1) has out_has [#23:1]
  bar(#22:1) has in_has [#23:1]
  has(#23:1) has two properties: out=#21:1 and in=#22:1

When you add a second 'has' edge from a foo node, the out_has list gets another RID (pointing either to a bar node for LW edges, or to a has edge for HW edges). This is OK for dozens of edges to and from individual nodes, but tends to fall down when thousands or hundreds of thousands of edges connect to a single node (call these super nodes), because those lists (1) have to be converted back and forth to array lists whenever you load the node into memory, and (2) have to be searched linearly any time you want to manipulate an existing edge.

You can reduce the performance hit under certain circumstances by using HW edges and creating indexes on the edge classes (LW edges can't be indexed). This can make it faster to determine "does this edge exist?" before creating it.

I built a sample app to measure edge creation performance; you can see it here: https://github.com/wcraigtrader/ogp

- Craig -

On Wed, Oct 14, 2015 at 4:03 PM, Simon White < [email protected]> wrote:
>
> I am using Orientdb v2.1.2
>
> I need to insert several million edges between two EXISTING vertex.
>
> I started off using the .NET API as my previous codebase (using an RDBMS)
> was c#-based however performance was terrible (~100 edges a second)
>
> I have completely re-written the code base in java now but unfortunately
> performance has not improved much. I was hoping to see at least several
> thousand edges created per second. I wonder if anyone could tell me if I am
> on the right track...
>
> Note: I already know the string ORID for the two vertex. The model is
> simply lots of V(from) =====> E =====> V(to) from vertex
> are unique (3million), to vertex are often common (50k)
>
> OrientGraphFactory factory = new OrientGraphFactory("remote:localhost/TEST");
> OrientGraph graph = factory.getTx();
>
> OrientVertex fromVertex = graph.getVertex(fromORID);
> OrientVertex toVertex = graph.getVertex(toORID);
>
> graph.addEdge("class:from_relates_to", fromVertex , toVertex , "from_relates_to");
>
> after several thousand I do
>
> graph.commit();
>
> graph.getVertex seems to be a particular bottleneck
>
> Thank you in anticipation...
>
> --
>
> ---
> You received this message because you are subscribed to the Google Groups
> "OrientDB" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> For more options, visit https://groups.google.com/d/optout.
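
As a rough illustration of the LW and HW layouts described in the reply, here is a minimal plain-Java sketch. It does not use the OrientDB API at all: records are modelled as maps from property names to lists of RID strings, and every name in it (EdgeLayoutSketch, linkLightweight, linkHeavyweight, existsByScan, addIfAbsent) is hypothetical. The HashSet only stands in for the idea of an index on an edge class; it is an assumption for illustration, not how OrientDB implements its indexes.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Plain-Java model of the two edge layouts; this is NOT the OrientDB API.
// All class and method names here are hypothetical.
class EdgeLayoutSketch {

    // A record is modelled as a map from property name to a list of RID strings.
    static Map<String, List<String>> record() {
        return new LinkedHashMap<>();
    }

    // Append a RID to a list-valued property, creating the list on first use.
    static void append(Map<String, List<String>> rec, String property, String rid) {
        rec.computeIfAbsent(property, k -> new ArrayList<>()).add(rid);
    }

    // LW edge: each node stores the RID of the opposite node directly.
    static void linkLightweight(Map<String, List<String>> from, String fromRid,
                                Map<String, List<String>> to, String toRid) {
        append(from, "out_has", toRid);
        append(to, "in_has", fromRid);
    }

    // HW edge: both nodes store the RID of the edge record,
    // and the edge record stores the RIDs of both nodes.
    static Map<String, List<String>> linkHeavyweight(Map<String, List<String>> from, String fromRid,
                                                     Map<String, List<String>> to, String toRid,
                                                     String edgeRid) {
        append(from, "out_has", edgeRid);
        append(to, "in_has", edgeRid);
        Map<String, List<String>> edge = record();
        append(edge, "out", fromRid); // node the edge leaves
        append(edge, "in", toRid);    // node the edge enters
        return edge;
    }

    // Without an index, "does this edge exist?" is a linear scan of the list.
    static boolean existsByScan(Map<String, List<String>> from, String targetRid) {
        return from.getOrDefault("out_has", new ArrayList<>()).contains(targetRid);
    }

    // Stand-in for an index on an edge class, keyed by "out->in".
    // Returns false when the pair is already present.
    static boolean addIfAbsent(Set<String> index, String fromRid, String toRid) {
        return index.add(fromRid + "->" + toRid);
    }

    public static void main(String[] args) {
        Map<String, List<String>> foo = record();
        Map<String, List<String>> bar = record();
        linkLightweight(foo, "#21:1", bar, "#22:1");
        System.out.println("LW foo=" + foo + " bar=" + bar);

        Map<String, List<String>> foo2 = record();
        Map<String, List<String>> bar2 = record();
        Map<String, List<String>> has = linkHeavyweight(foo2, "#21:1", bar2, "#22:1", "#23:1");
        System.out.println("HW foo=" + foo2 + " bar=" + bar2 + " has=" + has);

        Set<String> index = new HashSet<>();
        System.out.println(addIfAbsent(index, "#21:1", "#22:1")); // true: edge was new
        System.out.println(addIfAbsent(index, "#21:1", "#22:1")); // false: duplicate detected
    }
}
```

The scan in existsByScan grows with the length of the out_has list, which is the cost that hurts on super nodes; the set lookup in addIfAbsent does not depend on how many edges a node already has.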
