Hi Emin,

I'm suggesting that you use sorted vertex IDs when you set the vertex properties.
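To make that concrete, here is a minimal sketch of one way to keep the generated IDs dense and visit them in ascending order while still remembering which database object each ID belongs to. It does not call OrientDB at all: the class name `VertexIdRegistry`, its methods, and the use of `String` for the database keys are all assumptions made for illustration, and the `sink` callback only stands in for whatever call you make to set vertex properties.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BiConsumer;

// Hypothetical helper, not part of OrientDB: assigns dense batch IDs
// (0, 1, 2, ...) to database keys and replays them in ascending order.
final class VertexIdRegistry {
    // Database key -> generated batch ID (String keys are an assumption).
    private final Map<String, Long> batchIdByDbKey = new HashMap<>();
    // Reverse lookup: index i holds the database key that got batch ID i.
    private final List<String> dbKeyByBatchId = new ArrayList<>();

    // Returns the batch ID for a database key, allocating the next
    // sequential ID the first time the key is seen.
    long idFor(String dbKey) {
        Long existing = batchIdByDbKey.get(dbKey);
        if (existing != null) {
            return existing;
        }
        long id = dbKeyByBatchId.size();
        batchIdByDbKey.put(dbKey, id);
        dbKeyByBatchId.add(dbKey);
        return id;
    }

    // Largest batch ID handed out so far, or -1 if none.
    long maxId() {
        return dbKeyByBatchId.size() - 1L;
    }

    // Visits every (batch ID, database key) pair in ascending ID order.
    // The sink is where the per-vertex property call would go.
    void forEachAscending(BiConsumer<Long, String> sink) {
        for (int i = 0; i < dbKeyByBatchId.size(); i++) {
            sink.accept((long) i, dbKeyByBatchId.get(i));
        }
    }
}
```

You would call `idFor(...)` for both endpoints while creating edges, then, once all edges are created, use `forEachAscending(...)` to look up the properties for each database key and set them on the matching ID. Because the IDs are allocated without gaps, the maximum ID stays equal to the vertex count minus one, and the list replaces a separate sort step.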
Thanks

Luigi

2015-06-19 17:20 GMT+02:00 Emin Agassi <[email protected]>:

> Hi Luigi,
>
> What do you mean by insert rate? How long it takes to load a graph of
> this size?
> Currently, it takes 4 minutes to load 744,496 rows as vertices and
> 6,445,621 rows as edges.
> I am not sorting the vertex IDs, and I am not keeping the same order that
> I used for creating edges.
> I would like to confirm that you are suggesting I use the vertex IDs in
> the same order as when creating edges. Correct?
> Or are you suggesting that I just sort the vertex IDs after I create them
> for createEdge() and use the sorted IDs for setVertexProperties()?
>
> Also, my problem is that these newly generated IDs are not the IDs used in
> the database. This means that I need to map between the newly generated
> IDs and the object IDs stored in the DB.
> So I ended up with a HashMap between these new IDs and the IDs coming from
> the DB. Otherwise, I do not know which object I am setting vertex
> properties for.
>
> Thanks
> Emin
>
> On Friday, June 19, 2015 at 10:25:18 AM UTC-4, Luigi Dell'Aquila wrote:
>>
>> Hi Emin,
>>
>> the batch insert is very fast because it does a lot of operations in RAM
>> and then flushes raw data to disk all together.
>> The maximum vertex ID value matters a lot, because OrientDB will create
>> (and in some cases destroy, during the import process) as many records as
>> that number, so if you can keep it low you will get better performance.
>> The same goes for the sort order of vertex IDs when setting vertex
>> properties: I strongly suggest calling setVertexProperties() in fully
>> sorted ID order.
>> Anyway, when you invoke setVertexProperties(), records are actually
>> flushed to the clusters (before that, everything happens in RAM); this is
>> why you see a slowdown at that moment.
>> Out of curiosity, what insert rate are you getting?
>>
>> Thanks
>>
>> Luigi
>>
>> 2015-06-19 15:27 GMT+02:00 Emin Agassi <[email protected]>:
>>
>>> Hello Luigi or Luca,
>>>
>>> I have a question regarding the new batch insert class.
>>> I have a database that contains 744,496 rows which are loaded as
>>> vertices and 6,445,621 rows which are loaded as edges.
>>> I am using the new OGraphBatchInsert.
>>> I am following the procedure described in the Java comments: create
>>> edges first and then set vertex properties.
>>> The createEdge() API requires two Long IDs for vertices.
>>> Today, I have an incremental counter that I use to generate these IDs.
>>> Then I use the same IDs in the set vertex properties operation.
>>>
>>> Questions:
>>> Does it matter how large these generated IDs get?
>>> Should these IDs be used in the same sequential order in which they
>>> were created for createEdge(), and then in the same order when setting
>>> vertex properties?
>>>
>>> Creating edges is very fast for a graph of this size, but setting
>>> vertex properties is slow. Would you know why? Could this be related to
>>> the order of the vertex IDs between the createEdge() calls and setting
>>> vertex properties?
>>>
>>> I am not sure why setting vertex properties is so much slower. I am not
>>> setting large properties. I only have 4 properties to set.
>>>
>>> Thank you for your help
>>> Emin
>>>
>>> --
>>>
>>> ---
>>> You received this message because you are subscribed to the Google
>>> Groups "OrientDB" group.
>>> To unsubscribe from this group and stop receiving emails from it, send
>>> an email to [email protected].
>>> For more options, visit https://groups.google.com/d/optout.
