Re: [orientdb] Re: 1.7rc2 performance results

Andrey Lomakin Mon, 10 Feb 2014 02:55:31 -0800

Hi all,

Thank you all for your answers.
My main concern here is that benchmarks should use cases that are close to
real ones.


About edge distribution: we use a cache to optimize loops in the graph. I
mean that if a vertex is created and then loaded to create an edge, there is
a good probability that it will already be in the cache.
Anyway, I gathered links to benchmarks which we have used or are going to use.

Here is a load test of Wikipedia data:
https://github.com/laa/orientdb-wikipedia-benchmark
and there is a very interesting benchmark here:
https://github.com/Morro/GraphDBBenchmark

So if you publish your data using them, I would appreciate it very much.



On Sun, Feb 9, 2014 at 10:23 PM, Milen Dyankov <[email protected]> wrote:

> Hello Andrey Lomakin,
>
> as I wrote the original tests that Andrey Yesyev is basing his on, I
> thought I needed to step in with a word of explanation.
>
> Let me start by saying your findings are correct: the test indeed inserts
> a given number of vertices and then a given number of edges between the
> first two vertices. Generally speaking, you are also right in saying "this
> benchmark does not reproduce real test cases". However, *it was never
> meant to be a general purpose benchmark* (please have a look at the
> disclaimer of my original post
> https://groups.google.com/forum/#!topicsearchin/orient-database/perfomance%7Csort:date%7Cspell:true/orient-database/VF_j5rGeffA).
>
>
> The purpose of this test was to illustrate the fact that I found OrientDB
> to be very slow at inserting edges, in fact getting slower and slower as
> the number of edges increases. I also compared it to Neo4j just because I
> wasn't sure whether this is something OrientDB specific or whether it's due
> to the nature of graph databases in general.
>
> As far as transactions are concerned, my original code did not use
> transactions at all (at least not explicitly). According to the docs (back
> then) it was supposed to execute each operation instantly. I don't know
> (I didn't have the time to examine Andrey's code) why he introduced
> transactions, and while I agree that inserting millions of documents in a
> single transaction is not a good idea, I just wanted to point out that the
> original test was demonstrating the problem with no transactions at all.
> I'm pretty sure Andrey can easily change the code to commit data in smaller
> chunks, but honestly speaking I don't expect huge improvements (compared to
> the no-transaction case).
>
> As far as the structure of the data is concerned, I fail to see how that
> can cause performance degradation. Are you saying that if the test were
> to create edges between every 2 vertices, for example (instead of just the
> first 2), it would be faster? I highly doubt it. In fact I think the way the
> test is written should actually allow OrientDB to perform better than
> average, as it can utilize the cache and doesn't have to look for edges.
>
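
A rough way to picture the caching argument above is a tiny LRU cache
simulation. This is only a sketch: `VertexCacheSketch`, `LruCache`, and
`missesForTwoVertexWorkload` are hypothetical names, the cache is a generic
`LinkedHashMap`-based LRU rather than OrientDB's actual cache, and no database
is involved. It counts how often the two endpoint vertices would have to be
fetched again while edges are created between them.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration only; this is NOT OrientDB's cache implementation.
class VertexCacheSketch {

    /** Generic LRU cache: evicts the least recently accessed entry when full. */
    static final class LruCache<K, V> extends LinkedHashMap<K, V> {
        private final int capacity;

        LruCache(int capacity) {
            super(16, 0.75f, true); // true = order entries by access, not insertion
            this.capacity = capacity;
        }

        @Override
        protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
            return size() > capacity;
        }
    }

    /**
     * Simulates creating `edges` edges between vertex ids 0 and 1 and
     * returns how many vertex lookups missed the cache.
     */
    static int missesForTwoVertexWorkload(int edges, int capacity) {
        LruCache<Integer, String> cache = new LruCache<>(capacity);
        int misses = 0;
        for (int e = 0; e < edges; e++) {
            for (int id = 0; id < 2; id++) {
                if (cache.get(id) == null) {
                    misses++;                      // would be a storage read
                    cache.put(id, "vertex-" + id); // assumed: loaded records are cached
                }
            }
        }
        return misses;
    }

    public static void main(String[] args) {
        System.out.println(missesForTwoVertexWorkload(500_000, 1_000));
    }
}
```

With any capacity of two or more, only the first lookup of each vertex
misses, which is the sense in which this workload is cache-friendly.
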
> Finally, I have to admit I gave up on OrientDB half a year ago (don't get
> me wrong, nothing personal, I just found it not to be mature enough for the
> project I was working on), and while I'm still trying to keep an eye on this
> list, I'm not fully aware of all the optimizations that have happened since
> then. It may be the case that the test is no longer valid for the current
> version or needs to be rewritten completely. If I find some spare time I
> will try to update my original tests to use the latest version and post
> some results here.
>
> Regards,
> Milen
>
>
>
>
>
> On Sun, Feb 9, 2014 at 8:27 PM, Andrey Lomakin 
> <[email protected]> wrote:
>
>> Hi Andrey,
>> I started the benchmark on my side, and while it was running I
>> investigated it.
>> I think I should note that this benchmark does not reproduce real
>> test cases (I don't know what performance data you get on other DBs).
>>
>> Here is what this benchmark does.
>> Let's suppose that we have to insert 1 000 000 documents (vertexes and
>> edges).
>> It creates 500 000 vertexes, then takes 2 of them and creates
>> 500 000 edges between them.
>> And everything happens in one transaction.
>>
>> So we have a graph database with 499 998 unconnected vertexes and 2
>> vertexes which have 500 000 edges, and everything is committed in a single
>> transaction.
>> Did I miss something?
>>
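
The data shape described above can be reproduced without any database. The
following is a minimal sketch, assuming the benchmark attaches every edge to
the first two vertices; `SkewedGraphShape` and `countIsolated` are
hypothetical names used only for illustration.

```java
// Hypothetical stand-in for the benchmark's data shape; no database is involved.
class SkewedGraphShape {

    /** Returns the number of vertices that end up with no edges at all. */
    static int countIsolated(int vertexCount, int edgeCount) {
        if (vertexCount < 2) {
            throw new IllegalArgumentException("need at least two vertices");
        }
        // degree[i] = number of edge endpoints attached to vertex i
        int[] degree = new int[vertexCount];
        for (int e = 0; e < edgeCount; e++) {
            degree[0]++; // every edge starts at the first vertex
            degree[1]++; // and ends at the second one
        }
        int isolated = 0;
        for (int d : degree) {
            if (d == 0) {
                isolated++;
            }
        }
        return isolated;
    }

    public static void main(String[] args) {
        // 500 000 vertices and 500 000 edges, as in the description above
        System.out.println(countIsolated(500_000, 500_000)); // prints 499998
    }
}
```
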
>> I mean that I do not think you expect users to create such a data
>> structure and commit it in a single transaction.
>> Usually data structures are quite different, and changes are committed in
>> the following way: users load data, change it, and commit it.
>>
>> It is my personal opinion, but maybe you will be interested in a
>> performance test which loads real Wikipedia data, loading and committing
>> it in small batches?
>> Also, this test uses an index, which is very typical for DB usage.
>>
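
Committing in small batches, as suggested above, might look roughly like
this. The sketch uses a hypothetical `CountingTx` stand-in that only counts
commits instead of a real graph API, and the batch size of 1 000 is an
arbitrary assumption, not a recommended OrientDB setting.

```java
// Hypothetical sketch of batched commits; CountingTx is not a real graph API.
class BatchCommitSketch {

    /** Stand-in for a transactional graph: tracks pending writes and commits. */
    static final class CountingTx {
        int pending = 0;
        int commits = 0;

        void write() {
            pending++;
        }

        void commit() {
            if (pending > 0) { // an empty transaction is not counted
                commits++;
                pending = 0;
            }
        }
    }

    /** Performs `operations` writes, committing after every `batchSize` of them. */
    static int load(int operations, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        CountingTx tx = new CountingTx();
        for (int i = 1; i <= operations; i++) {
            tx.write();
            if (i % batchSize == 0) {
                tx.commit(); // keep each transaction small
            }
        }
        tx.commit(); // flush the final partial batch, if any
        return tx.commits;
    }

    public static void main(String[] args) {
        System.out.println(load(1_000_000, 1_000));     // prints 1000
        System.out.println(load(1_000_000, 1_000_000)); // prints 1
    }
}
```

The second call corresponds to the single-transaction case discussed in this
thread; the first keeps every transaction at 1 000 operations.
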
>> We used such a test case, so I can change it and publish it as a Maven
>> project, and because it is TinkerPop based you can test all the DBs which
>> you are interested in.
>> Our load test does not have properties on vertexes, only relations and an
>> index by page key, but it is simple to add additional properties.
>>
>> What do you think?
>>
>>
>>
>> On Sun, Feb 9, 2014 at 5:59 PM, Andrey Yesyev <[email protected]> wrote:
>>
>>> Please post your results!
>>>
>>> Again, any comments regarding source code are very welcome!
>>>
>>>
>>> On Sunday, February 9, 2014 10:50:34 AM UTC-5, Andrey Lomakin wrote:
>>>
>>>> Andrey,
>>>> I do not see any commits in the project:
>>>> https://github.com/ayesyev/graphdb-tests/commits/master
>>>> Did you push them?
>>>>
>>>>
>>>> On Sun, Feb 9, 2014 at 5:47 PM, Andrey Lomakin <[email protected]> wrote:
>>>>
>>>>> Got it ! ))
>>>>>
>>>>>
>>>>> On Sun, Feb 9, 2014 at 5:44 PM, Andrey Lomakin 
>>>>> <[email protected]> wrote:
>>>>>
>>>>>> Hi Andrey,
>>>>>>
>>>>>> Could you provide instructions on how to run these tests and see
>>>>>> the statistical results?
>>>>>>
>>>>>>
>>>>>> On Sun, Feb 9, 2014 at 4:59 PM, Andrey Yesyev <[email protected]> wrote:
>>>>>>
>>>>>>> Ok, here we go!
>>>>>>>
>>>>>>> I added all of Andrey's tips to the project.
>>>>>>>
>>>>>>> storage.diskCache.bufferSize is set to 14336.
>>>>>>>
>>>>>>> All edges have the appropriate number of properties and are added
>>>>>>> this way:
>>>>>>>
>>>>>>> protected OrientEdge createEdge(Vertex v1, Vertex v2) {
>>>>>>>     // Build the edge properties: property0 .. property(N-1)
>>>>>>>     Map<String, String> properties = new HashMap<String, String>();
>>>>>>>     for (int i = 0; i < numberOfProperties; i++)
>>>>>>>         properties.put("property" + i, "value" + i);
>>>>>>>     // Create an edge of class "E" from v1 to v2 with those properties
>>>>>>>     OrientEdge e = ((OrientVertex) v1).addEdge(
>>>>>>>         null, (OrientVertex) v2, "E", null, properties);
>>>>>>>     e.save();
>>>>>>>     return e;
>>>>>>> }
>>>>>>>
>>>>>>> Results are attached for remote and embedded modes (both using the
>>>>>>> plocal storage type).
>>>>>>> On Monday I'll try to draw my conclusions.
>>>>>>>
>>>>>>> All changes are committed to the GitHub project.
>>>>>>>
>>>>>>>
>>>>>>>  --
>>>>>>>
>>>>>>> ---
>>>>>>> You received this message because you are subscribed to the Google
>>>>>>> Groups "OrientDB" group.
>>>>>>> To unsubscribe from this group and stop receiving emails from it,
>>>>>>> send an email to [email protected].
>>>>>>>
>>>>>>> For more options, visit https://groups.google.com/groups/opt_out.
>>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> --
>>>>>> Best regards,
>>>>>> Andrey Lomakin.
>>>>>>
>>>>>> Orient Technologies
>>>>>> the Company behind OrientDB
>>>>>>
>>>>>>
> --
> http://about.me/milen



-- 
Best regards,
Andrey Lomakin.

Orient Technologies
the Company behind OrientDB
