Thank you.
I also had to modify the OGraphBatchInsert class to support storing BLOBs
in bulk. I added this code to createVertex and setVertexProperties:
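    // Store the BLOB as its own binary record and link it from the vertex document.
    if (properties != null && properties.containsKey("BLOB")) {
        String xmlBlob = (String) properties.get("BLOB");
        // Explicit charset (java.nio.charset.StandardCharsets) instead of the platform default.
        ORecordBytes record = new ORecordBytes(xmlBlob.getBytes(StandardCharsets.UTF_8));
        record.save();
        doc.field("BLOB", record);
        // Remove the raw string so it is not stored a second time as a plain property.
        properties.remove("BLOB");
    }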
This makes the bulk insert slower, but not dramatically so.
Does this look OK to you, or is there a better method?
Grazie!
Emin
On Friday, June 19, 2015 at 12:24:50 PM UTC-4, Luigi Dell'Aquila wrote:
>
> Hi Emin,
>
> I'm suggesting that you use sorted vertex IDs when you insert the properties.
>
> Thanks
>
> Luigi
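A minimal sketch of the sorted-ID suggestion above, assuming the properties can be buffered in memory before they are written. `VertexPropertySink` and `SortedPropertyLoader` are hypothetical names introduced only for illustration; the interface stands in for the batch inserter and assumes a `setVertexProperties(Long, Map<String, Object>)` call, which should be checked against the actual class.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical stand-in for the batch inserter: only the one call used here.
interface VertexPropertySink {
    void setVertexProperties(Long vertexId, Map<String, Object> properties);
}

// Buffers vertex properties and hands them over in ascending vertex-ID order.
final class SortedPropertyLoader {
    // TreeMap iterates its keys in ascending order, so no separate sort step is needed.
    private final TreeMap<Long, Map<String, Object>> pending = new TreeMap<>();

    // Remembers the properties for one vertex; a later call for the same ID replaces them.
    void add(long vertexId, Map<String, Object> properties) {
        pending.put(vertexId, properties);
    }

    // Pushes every buffered entry to the sink, lowest vertex ID first, then empties the buffer.
    void flushTo(VertexPropertySink sink) {
        for (Map.Entry<Long, Map<String, Object>> entry : pending.entrySet()) {
            sink.setVertexProperties(entry.getKey(), entry.getValue());
        }
        pending.clear();
    }
}
```

The properties can be added in whatever order the source rows arrive; the ordering is applied only at flush time. The trade-off is that all properties are held in memory until then.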
>
>
>
> 2015-06-19 17:20 GMT+02:00 Emin Agassi <[email protected]>:
>
>> Hi Luigi,
>>
>> What do you mean by insert rate? How long it takes to load a graph of
>> this size?
>> Currently, it takes 4 minutes to load 744,496 rows as vertices
>> and 6,445,621 rows as edges.
>> I am not sorting the vertex IDs, and I am not keeping the same order that
>> I used for creating the edges.
>> I would like to confirm: are you suggesting that I use the vertex IDs in
>> the same order as when creating the edges?
>> Or are you suggesting that I just sort the vertex IDs after creating them
>> for createEdge and use the sorted IDs for setVertexProperties?
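Taking "insert rate" to mean records written per second (an interpretation, not something defined in this thread), the figures above work out to roughly 30,000 records per second overall. A minimal sketch of the arithmetic, treating the 4 minutes as exact; the class name `InsertRate` is purely illustrative.

```java
// Illustrative helper: turns a row count and a duration into records per second.
final class InsertRate {
    // Records per second for a load of `records` rows that took `seconds` seconds.
    static double perSecond(long records, double seconds) {
        return records / seconds;
    }

    public static void main(String[] args) {
        long vertices = 744_496L;   // rows loaded as vertices
        long edges = 6_445_621L;    // rows loaded as edges
        double seconds = 4 * 60;    // "4 minutes" from the message, taken as exact
        System.out.println("vertices/s: " + perSecond(vertices, seconds));
        System.out.println("edges/s:    " + perSecond(edges, seconds));
        System.out.println("total/s:    " + perSecond(vertices + edges, seconds));
    }
}
```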
>>
>> Also, my problem is that these newly generated IDs are not the IDs used
>> in the database. This means that I need to map between the newly
>> generated IDs and the object IDs stored in the DB.
>> So I ended up keeping a HashMap between the new IDs and the IDs coming
>> from the DB. Otherwise, I do not know which object I am setting vertex
>> properties for.
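A minimal sketch of the mapping described above, assuming the database keys fit in memory. `DenseIdMapper` is a hypothetical helper, not part of OrientDB, and starting the counter at 0 is an assumption. It assigns consecutive IDs on first sight, so the largest generated ID stays at the number of distinct objects minus one, and it keeps a reverse list so a generated ID can be traced back to its database key.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Maps arbitrary database keys to consecutive long IDs and back.
final class DenseIdMapper<K> {
    private final Map<K, Long> toDense = new HashMap<>();
    // Index in this list equals the generated ID, so reverse lookup is a plain get().
    private final List<K> toKey = new ArrayList<>();

    // Returns the generated ID for a database key, assigning the next free one on first sight.
    long idFor(K dbKey) {
        Long existing = toDense.get(dbKey);
        if (existing != null) {
            return existing;
        }
        long id = toKey.size();
        toDense.put(dbKey, id);
        toKey.add(dbKey);
        return id;
    }

    // Returns the database key that a generated ID was assigned to.
    K keyFor(long generatedId) {
        return toKey.get((int) generatedId);
    }

    // Largest ID handed out so far, or -1 if none has been assigned yet.
    long maxId() {
        return toKey.size() - 1L;
    }
}
```

Calling `idFor` for both endpoints while creating edges, and `keyFor` later while setting vertex properties, would keep one mapping for both phases. The `int` cast limits this sketch to about two billion vertices.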
>>
>> Thanks
>> Emin
>>
>>
>> On Friday, June 19, 2015 at 10:25:18 AM UTC-4, Luigi Dell'Aquila wrote:
>>>
>>> Hi Emin,
>>>
>>> the batch insert is very fast because it does a lot of operations in RAM
>>> and then flushes raw data to disk all together.
>>> The maximum value of the vertex IDs matters a lot, because OrientDB will
>>> create (and in some cases destroy, during the import process) as many
>>> records as that number, so if you can keep it low you will get better
>>> performance.
>>> The same goes for the order of vertex IDs in setVertexProperties():
>>> I strongly suggest that you call it with the IDs fully sorted.
>>> Also, when you invoke setVertexProperties(), records are actually
>>> flushed to the clusters (before that, everything happens in RAM); this is
>>> why you see a slowdown at that point.
>>> Out of curiosity, what insert rate are you getting?
>>>
>>> Thanks
>>>
>>> Luigi
>>>
>>>
>>>
>>>
>>>
>>> 2015-06-19 15:27 GMT+02:00 Emin Agassi <[email protected]>:
>>>
>>>>
>>>> Hello Luigi or Luca,
>>>>
>>>> I have a question regarding the new Batch Insert class.
>>>> I have a database that contains 744,496 rows which are loaded as
>>>> Vertices and 6,445,621 rows which are loaded as Edges.
>>>> I am using the new OGraphBatchInsert.
>>>> I am following the procedure described in the Java comments: create
>>>> edges first and then set vertex properties.
>>>> The createEdge() API requires two Long IDs for the vertices.
>>>> Today, I have an incremental counter that I use to generate these IDs.
>>>> Then, I use the same IDs in the setVertexProperties operation.
>>>>
>>>> Questions:
>>>> Does it matter how large these generated IDs get?
>>>> Should these IDs be used in the same sequential order in which they
>>>> were created for createEdge, and then in that same order for
>>>> setVertexProperties?
>>>>
>>>> Creating edges is very fast for a graph of this size, but setting
>>>> vertex properties is slow. Would you know why? Could this be related to
>>>> the order of the vertex IDs differing between the createEdge calls and
>>>> the setVertexProperties calls?
>>>>
>>>> I am not sure why setting vertex properties is so much slower. I am not
>>>> setting large properties. I only have 4 properties to set.
>>>>
>>>> Thank you for your help
>>>> Emin
>>>>
>>>> --
>>>>
>>>> ---
>>>> You received this message because you are subscribed to the Google
>>>> Groups "OrientDB" group.
>>>> To unsubscribe from this group and stop receiving emails from it, send
>>>> an email to [email protected].
>>>> For more options, visit https://groups.google.com/d/optout.
>>>>
>>>
>
>