Hi Luigi,

What do you mean by insert rate? Do you mean how long it takes to load a 
graph of this size?
Currently, it takes 4 minutes to load 744,496 rows as vertices and 
6,445,621 rows as edges, which works out to roughly 30,000 records per 
second overall.
I am not sorting the vertex ids, and I am not keeping the same order that 
I used when creating the edges.
I would like to confirm: are you suggesting that I call 
setVertexProperties() with the vertex ids in the same order that I used 
when creating the edges?
Or are you suggesting that I simply sort the vertex ids after the 
createEdge calls and use the sorted ids for setVertexProperties()?
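
To make the second option concrete, here is a minimal sketch in plain 
Java of what I understand by it: collect the vertex ids, sort them once, 
and walk them in ascending order when setting properties. The 
PropertySink interface is a hypothetical stand-in for the 
setVertexProperties() call and is not part of OrientDB; the class and 
method names are made up for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class SortedVertexPass {

    /** Hypothetical stand-in for the batch inserter's property-setting call. */
    interface PropertySink {
        void set(long vertexId, Map<String, Object> props);
    }

    /**
     * Feeds every vertex id to the sink in ascending id order and
     * returns the ids in the order they were visited.
     */
    static long[] setInSortedOrder(Map<Long, Map<String, Object>> propsById,
                                   PropertySink sink) {
        long[] ids = new long[propsById.size()];
        int i = 0;
        for (Long id : propsById.keySet()) {
            ids[i++] = id;
        }
        // Sort once up front so the sink only ever sees increasing ids.
        Arrays.sort(ids);
        for (long id : ids) {
            sink.set(id, propsById.get(id));
        }
        return ids;
    }

    public static void main(String[] args) {
        // Properties keyed by generated vertex id, in no particular order.
        Map<Long, Map<String, Object>> propsById = new HashMap<>();
        for (long id : new long[] {42L, 7L, 19L}) {
            Map<String, Object> props = new HashMap<>();
            props.put("name", "vertex-" + id);
            propsById.put(id, props);
        }

        List<Long> visited = new ArrayList<>();
        setInSortedOrder(propsById, (id, props) -> visited.add(id));
        System.out.println("Visited in order: " + visited);
    }
}
```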

Also, my problem is that these newly generated ids are not the ids used in 
the database, so I need to map between the generated ids and the object 
ids stored in the DB.
I ended up keeping a HashMap between the new ids and the ids coming from 
the DB. Otherwise, I would not know which object I am setting vertex 
properties for.
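
For illustration, here is a minimal sketch of that mapping in plain Java. 
It hands out dense ids from an incremental counter, so the largest id 
stays at (number of vertices - 1), and it keeps a reverse list so the 
database id can be looked up while walking the dense ids in ascending 
order. Assumptions: the database ids are Strings, and the counter starts 
at 0. DenseIdMapper and its methods are hypothetical names, not part of 
OrientDB.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class DenseIdMapper {
    // Database id -> generated dense id.
    private final Map<String, Long> toDense = new HashMap<>();
    // Generated dense id (used as list index) -> database id.
    private final List<String> toDb = new ArrayList<>();

    /** Returns the dense id for dbId, assigning the next counter value on first use. */
    long denseIdFor(String dbId) {
        Long existing = toDense.get(dbId);
        if (existing != null) {
            return existing;
        }
        long next = toDb.size();
        toDense.put(dbId, next);
        toDb.add(dbId);
        return next;
    }

    /** Returns the database id that was assigned the given dense id. */
    String dbIdFor(long denseId) {
        return toDb.get((int) denseId);
    }

    /** Number of distinct database ids seen so far. */
    long size() {
        return toDb.size();
    }

    public static void main(String[] args) {
        DenseIdMapper mapper = new DenseIdMapper();

        // Edge rows as (from, to) pairs of database ids.
        String[][] edgeRows = {{"obj-90", "obj-12"}, {"obj-12", "obj-55"}};
        for (String[] row : edgeRows) {
            long from = mapper.denseIdFor(row[0]);
            long to = mapper.denseIdFor(row[1]);
            System.out.println("edge " + from + " -> " + to);
        }

        // Walking 0..size-1 is already ascending, so no separate sort is needed.
        for (long id = 0; id < mapper.size(); id++) {
            System.out.println("vertex " + id + " is " + mapper.dbIdFor(id));
        }
    }
}
```

With this layout the second pass can simply count from 0 up to size() - 1, 
which gives the ascending order without sorting anything.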

Thanks
Emin


On Friday, June 19, 2015 at 10:25:18 AM UTC-4, Luigi Dell'Aquila wrote:
>
> Hi Emin,
>
> The batch insert is very fast because it does a lot of operations in RAM 
> and then flushes the raw data to disk all at once.
> The maximum vertex ID value matters a lot, because OrientDB will create 
> (and in some cases destroy, during the import process) as many records as 
> that number, so if you can keep it low you will get better performance.
> The same goes for the order of vertex ids when setting vertex properties: 
> I strongly suggest calling setVertexProperties() in fully sorted order.
> Also, when you invoke setVertexProperties(), records are actually flushed 
> to the clusters (before that everything happens in RAM), which is why you 
> see a slowdown at that point.
> Out of curiosity, what insert rate are you getting?
>
> Thanks
>
> Luigi
>
> 2015-06-19 15:27 GMT+02:00 Emin Agassi <[email protected]>:
>
>>
>> Hello Luigi or Luca,
>>
>> I have a question regarding the new batch insert class. 
>> I have a database that contains 744,496 rows which are loaded as vertices 
>> and 6,445,621 rows which are loaded as edges.
>> I am using the new OGraphBatchInsert.
>> I am following the procedure described in the Java comments: create the 
>> edges first and then set the vertex properties.
>> The createEdge() API requires two Long ids for the vertices.
>> Today, I use an incremental counter to generate these ids.
>> Then, I use the same ids in the set vertex properties operation.
>>
>> Questions: 
>> Does it matter how large these generated ids get?
>> Should these ids be used in the same sequential order, first for the 
>> createEdge calls and then for setting the vertex properties?
>>
>> Creating edges is very fast for a graph of this size, but setting vertex 
>> properties is slow. Would you know why? Could this be related to the 
>> order of the vertex ids differing between the createEdge calls and the 
>> set vertex properties calls?
>>
>> I am not sure why setting vertex properties is so much slower. I am not 
>> setting large properties. I only have 4 properties to set.
>>
>> Thank you for your help
>> Emin
>>
>>  -- 
>>
>> --- 
>> You received this message because you are subscribed to the Google Groups 
>> "OrientDB" group.
>> To unsubscribe from this group and stop receiving emails from it, send an 
>> email to [email protected].
>> For more options, visit https://groups.google.com/d/optout.
>>
>
>
