Hi Emin,

I think it's OK for a single use case. For the BatchInsert, though, I'll
implement something more general. Could you please open a new issue about
this?

Thanks

Luigi
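
A minimal sketch of the two points from the quoted messages below: keep the
generated vertex IDs small, and set vertex properties in ascending ID order.
The class VertexIdMapper and its method names are hypothetical and not part
of OrientDB. The counter starting at 0 and the String database keys are
assumptions, and the actual batch call is passed in as a callback so the
sketch runs without OrientDB on the classpath.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;
import java.util.function.BiConsumer;

/** Hypothetical helper (not an OrientDB class): maps database keys to compact batch IDs. */
public class VertexIdMapper {
    // Database key -> generated batch ID (the HashMap described in the thread).
    private final Map<String, Long> dbKeyToBatchId = new HashMap<>();
    // Assumption: IDs start at 0 and grow by 1, so the maximum ID stays as low as possible.
    private long nextId = 0;

    /** Returns the batch ID for a database key, assigning the next counter value on first use. */
    public long idFor(String dbKey) {
        Long existing = dbKeyToBatchId.get(dbKey);
        if (existing != null) {
            return existing;
        }
        long id = nextId++;
        dbKeyToBatchId.put(dbKey, id);
        return id;
    }

    /** Highest ID handed out so far, or -1 if no ID has been assigned yet. */
    public long maxId() {
        return nextId - 1;
    }

    /** Invokes the sink once per vertex, in ascending batch-ID order. */
    public void applyPropertiesSorted(Map<String, Map<String, Object>> propsByDbKey,
                                      BiConsumer<Long, Map<String, Object>> sink) {
        // TreeMap iterates its keys in ascending order, which gives the sorted pass.
        TreeMap<Long, Map<String, Object>> sorted = new TreeMap<>();
        for (Map.Entry<String, Map<String, Object>> e : propsByDbKey.entrySet()) {
            sorted.put(idFor(e.getKey()), e.getValue());
        }
        sorted.forEach(sink);
    }
}
```

With the batch inserter, the sink would presumably be a call to
setVertexProperties, assuming it accepts a Long ID and a property map. The
same idFor() values would be passed to createEdge beforehand.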


2015-06-19 19:06 GMT+02:00 Emin Agassi <[email protected]>:

> Thank you.
> I also had to modify the OGraphBatchInsert class to support storing BLOBs
> in bulk. I added this code to createVertex and setVertexProperties:
>
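> if (properties != null && properties.containsKey("BLOB")) {
>     // Save the BLOB as its own binary record and link it from the document.
>     String xmlBlob = (String) properties.get("BLOB");
>     ORecordBytes record = new ORecordBytes(xmlBlob.getBytes());
>     record.save();
>     doc.field("BLOB", record);
>     // Remove the raw value so it is not also stored inline as a property.
>     properties.remove("BLOB");
> }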
>
> This makes the bulk insert slower, but not extremely slow.
> Does this look OK to you, or is there a better method?
>
> Grazie!
> Emin
>
>
>
> On Friday, June 19, 2015 at 12:24:50 PM UTC-4, Luigi Dell'Aquila wrote:
>>
>> Hi Emin,
>>
>> I suggest using sorted vertex IDs when you insert the properties.
>>
>> Thanks
>>
>> Luigi
>>
>>
>>
>> 2015-06-19 17:20 GMT+02:00 Emin Agassi <[email protected]>:
>>
>>> Hi Luigi,
>>>
>>> What do you mean by insert rate? How long it takes to load a graph of
>>> this size?
>>> Currently, it takes 4 minutes to load 744,496 rows as vertices
>>> and 6,445,621 rows as edges.
>>> I am not sorting the vertex IDs, and I am not keeping the same order that
>>> I used for creating the edges.
>>> I would like to confirm: are you suggesting that I use the vertex IDs in
>>> the same order as when creating the edges?
>>> Or are you suggesting that I just sort the vertex IDs after I create them
>>> for the createEdge calls and use the sorted IDs for setVertexProperties?
>>>
>>> Also, my problem is that these newly generated IDs are not the IDs used
>>> in the database. This means that I need to map between the newly
>>> generated IDs and the object IDs stored in the DB.
>>> So I ended up keeping a HashMap between these new IDs and the IDs coming
>>> from the DB. Otherwise, I do not know which object I am setting vertex
>>> properties for.
>>>
>>> Thanks
>>> Emin
>>>
>>>
>>> On Friday, June 19, 2015 at 10:25:18 AM UTC-4, Luigi Dell'Aquila wrote:
>>>>
>>>> Hi Emin,
>>>>
>>>> the batch insert is very fast because it does a lot of operations in
>>>> RAM and then flushes the raw data to disk all at once.
>>>> The maximum vertex ID value matters a lot, because OrientDB will
>>>> create (and in some cases destroy, during the import process) as many
>>>> records as that number, so if you can keep it low you will get better
>>>> performance.
>>>> The same goes for the order of the vertex IDs in setVertexProperties():
>>>> I strongly suggest calling it in fully sorted ID order.
>>>> Also, when you invoke setVertexProperties() the records are actually
>>>> flushed to the clusters (before that, everything happens in RAM), which
>>>> is why you see a slowdown at that point.
>>>> Out of curiosity, what insert rate are you getting?
>>>>
>>>> Thanks
>>>>
>>>> Luigi
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> 2015-06-19 15:27 GMT+02:00 Emin Agassi <[email protected]>:
>>>>
>>>>>
>>>>> Hello Luigi or Luca,
>>>>>
>>>>> I have a question regarding the new batch insert class.
>>>>> I have a database that contains 744,496 rows which are loaded as
>>>>> vertices and 6,445,621 rows which are loaded as edges.
>>>>> I am using the new OGraphBatchInsert.
>>>>> I am following the procedure described in the Java comments: create
>>>>> the edges first and then set the vertex properties.
>>>>> The createEdge() API requires two Long IDs for the vertices.
>>>>> Today, I have an incremental counter that I use to generate these IDs.
>>>>> Then, I use the same IDs in the setVertexProperties operation.
>>>>>
>>>>> Questions:
>>>>> Does it matter how large these generated IDs get?
>>>>> Should these IDs be used in the same sequential order in which they
>>>>> were created for createEdge, and then in the same order for
>>>>> setVertexProperties?
>>>>>
>>>>> Creating edges is very fast for a graph of this size, but setting
>>>>> vertex properties is slow. Would you know why? Could this be related to
>>>>> the order of the vertex IDs between the createEdge calls and the
>>>>> setVertexProperties calls?
>>>>>
>>>>> I am not sure why setting vertex properties is so much slower. I am not
>>>>> setting large properties. I only have 4 properties to set.
>>>>>
>>>>> Thank you for help
>>>>> Emin
>>>>>
>>>>>  --
>>>>>
>>>>> ---
>>>>> You received this message because you are subscribed to the Google
>>>>> Groups "OrientDB" group.
>>>>> To unsubscribe from this group and stop receiving emails from it, send
>>>>> an email to [email protected].
>>>>> For more options, visit https://groups.google.com/d/optout.
>>>>>
>>>>
>>
