Hi Emin,

I think that's OK for a single use case. For the BatchInsert, though, I'll implement something more general. Could you please open a new issue about this?

Thanks

Luigi

2015-06-19 19:06 GMT+02:00 Emin Agassi <[email protected]>:

> Thank you.
> I also had to modify the OGraphBatchInsert class to support storing BLOBs
> in bulk. I added this code to createVertex and setVertexProperties:
>
>     if (properties != null && properties.containsKey("BLOB")) {
>         String xmlBlob = (String) properties.get("BLOB");
>         ORecordBytes record = new ORecordBytes(xmlBlob.getBytes());
>         record.save();
>         doc.field("BLOB", record);
>         properties.remove("BLOB");
>     }
>
> This makes the bulk load slower, but not extremely slow.
> Does this look OK to you, or is there a better method?
>
> Grazie!
> Emin
>
> On Friday, June 19, 2015 at 12:24:50 PM UTC-4, Luigi Dell'Aquila wrote:
>>
>> Hi Emin,
>>
>> I'm suggesting that you use sorted vertex IDs when you insert the
>> properties.
>>
>> Thanks
>>
>> Luigi
>>
>> 2015-06-19 17:20 GMT+02:00 Emin Agassi <[email protected]>:
>>
>>> Hi Luigi,
>>>
>>> What do you mean by insert rate? How long it takes to load a graph of
>>> this size?
>>> Currently, it takes 4 minutes to load 744,496 rows as vertices
>>> and 6,445,621 rows as edges.
>>> I am not sorting the vertex IDs, and I am not keeping the same order
>>> that I used for creating edges.
>>> I would like to confirm that you are suggesting using the vertex IDs in
>>> the same order as when creating edges. Correct?
>>> Or are you suggesting that I just sort the vertex IDs after I create
>>> them for the createEdge calls and use the sorted IDs for
>>> setVertexProperties?
>>>
>>> Also, my problem is that these newly generated IDs are not the IDs used
>>> in the database. This means that I need to map between the newly
>>> generated IDs and the object IDs stored in the DB.
>>> So I ended up keeping a HashMap between these new IDs and the IDs coming
>>> from the DB. Otherwise, I do not know which object I am setting vertex
>>> properties for.
>>>
>>> Thanks
>>> Emin
>>>
>>> On Friday, June 19, 2015 at 10:25:18 AM UTC-4, Luigi Dell'Aquila wrote:
>>>>
>>>> Hi Emin,
>>>>
>>>> the batch insert is very fast because it does a lot of operations in
>>>> RAM and then flushes the raw data to disk all at once.
>>>> The maximum vertex ID value matters a lot, because OrientDB will
>>>> create (and in some cases destroy, during the import process) as many
>>>> records as that number, so if you can keep it low you will get better
>>>> performance.
>>>> The same goes for the order of the vertex IDs when setting vertex
>>>> properties: I strongly suggest calling setVertexProperties() in fully
>>>> sorted order.
>>>> Anyway, when you invoke setVertexProperties(), records are actually
>>>> flushed to the clusters (before that, everything happens in RAM); this
>>>> is why you see a slowdown at that point.
>>>> Out of curiosity, what insert rate are you getting?
>>>>
>>>> Thanks
>>>>
>>>> Luigi
>>>>
>>>> 2015-06-19 15:27 GMT+02:00 Emin Agassi <[email protected]>:
>>>>
>>>>> Hello Luigi or Luca,
>>>>>
>>>>> I have a question regarding the new batch insert class.
>>>>> I have a database that contains 744,496 rows, which are loaded as
>>>>> vertices, and 6,445,621 rows, which are loaded as edges.
>>>>> I am using the new OGraphBatchInsert.
>>>>> I am following the procedure described in the Java comments: create
>>>>> edges first and then set vertex properties.
>>>>> The createEdge() API requires two Long IDs for vertices.
>>>>> Today, I have an incremental counter that I use to generate these IDs.
>>>>> Then I use the same IDs in the set vertex properties operation.
>>>>>
>>>>> Questions:
>>>>> Does it matter how large these generated IDs get?
>>>>> Should these IDs be used in the same sequential order, first when
>>>>> creating them for createEdge and then again when setting vertex
>>>>> properties?
>>>>>
>>>>> Creating edges is very fast for a graph of this size, but setting
>>>>> vertex properties is slow. Would you know why? Could this be related
>>>>> to the order of the vertex IDs between the createEdge calls and the
>>>>> set vertex properties calls?
>>>>>
>>>>> I am not sure why setting vertex properties is so much slower. I am
>>>>> not setting large properties. I only have 4 properties to set.
>>>>>
>>>>> Thank you for your help
>>>>> Emin
>>>>>
>>>>> --
>>>>>
>>>>> ---
>>>>> You received this message because you are subscribed to the Google
>>>>> Groups "OrientDB" group.
>>>>> To unsubscribe from this group and stop receiving emails from it, send
>>>>> an email to [email protected].
>>>>> For more options, visit https://groups.google.com/d/optout.
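
For readers following the thread, here is a minimal Java sketch of the two points Luigi raises (keep the largest vertex ID low, and set vertex properties in sorted ID order), combined with the ID mapping Emin describes. Everything named here is an assumption for illustration: `DenseVertexIds`, `BatchSink`, `idFor`, `dbKeyFor`, and `flushProperties` are hypothetical, and `BatchSink` is only a stand-in, not the OGraphBatchInsert API, whose exact signatures are not shown in this thread. Starting the IDs at 0 is also an assumption.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper: hands out dense, sequential vertex IDs for database keys
// and replays vertex properties in ascending ID order.
class DenseVertexIds {

    // Hypothetical stand-in for a batch inserter; NOT the OrientDB API.
    interface BatchSink {
        void createEdge(long from, long to);
        void setVertexProperties(long id, Map<String, Object> props);
    }

    // Database key -> generated ID (used while creating edges).
    private final Map<String, Long> dbKeyToId = new HashMap<>();
    // Generated ID -> database key; the list index is the ID, so no second HashMap is needed.
    private final List<String> idToDbKey = new ArrayList<>();

    // Returns the ID already assigned to dbKey, or assigns the next free one.
    // IDs are dense (0, 1, 2, ...), so the maximum ID equals vertexCount - 1.
    long idFor(String dbKey) {
        Long existing = dbKeyToId.get(dbKey);
        if (existing != null) {
            return existing;
        }
        long id = idToDbKey.size();
        dbKeyToId.put(dbKey, id);
        idToDbKey.add(dbKey);
        return id;
    }

    // Reverse lookup: which database object does a generated ID belong to?
    String dbKeyFor(long id) {
        return idToDbKey.get((int) id);
    }

    // Sets properties in strictly ascending ID order, skipping vertices without properties.
    void flushProperties(BatchSink sink, Map<String, Map<String, Object>> propsByDbKey) {
        for (int id = 0; id < idToDbKey.size(); id++) {
            Map<String, Object> props = propsByDbKey.get(idToDbKey.get(id));
            if (props != null) {
                sink.setVertexProperties(id, props);
            }
        }
    }

    public static void main(String[] args) {
        DenseVertexIds ids = new DenseVertexIds();
        BatchSink sink = new BatchSink() {
            public void createEdge(long from, long to) {
                System.out.println("edge " + from + " -> " + to);
            }
            public void setVertexProperties(long id, Map<String, Object> props) {
                System.out.println("props for " + id + " " + props);
            }
        };

        // Edges first: IDs are assigned in the order the keys are first seen.
        sink.createEdge(ids.idFor("db-42"), ids.idFor("db-7"));
        sink.createEdge(ids.idFor("db-7"), ids.idFor("db-99"));

        // Properties afterwards, keyed by database key and replayed in ID order.
        Map<String, Map<String, Object>> props = new HashMap<>();
        Map<String, Object> p = new HashMap<>();
        p.put("name", "example");
        props.put("db-7", p);
        ids.flushProperties(sink, props);
    }
}
```

Because the generated ID doubles as the list index, the reverse lookup from generated ID to database key costs no extra map, and iterating the list in index order gives the sorted order for free.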
