Re: [orientdb] Re: 1.7rc2 performance results

Giraldo Rosales Thu, 13 Feb 2014 12:20:09 -0800

It would be great if someone benchmarked OrientDB, MySQL (with joins), and 
MySQL/Neo4j to get some speed comparisons. I noticed there are some out 
there that use older versions of OrientDB (1.3).





On Monday, February 10, 2014 5:54:54 AM UTC-5, Andrey Lomakin wrote:
>
> Hi all,
>
> Thank you all for the answers.
> My main concern here is that benchmarks should use cases that are close 
> to real ones. 
>
> About edge distribution: we use a cache to optimize loops in the graph. I 
> mean that if a vertex is created and then loaded to create an edge, there 
> is a good probability that it will be in the cache.
> Anyway, I gathered links to benchmarks which we used or are going to use.
>
> Here is a load test of Wikipedia data: 
> https://github.com/laa/orientdb-wikipedia-benchmark
> and there is a very interesting benchmark here: 
> https://github.com/Morro/GraphDBBenchmark 
>
> So if you publish your data using them, I would appreciate it very much.
>
>
>
> On Sun, Feb 9, 2014 at 10:23 PM, Milen Dyankov <[email protected]> wrote:
>
>> Hello Andrey Lomakin,
>>
>> as I wrote the original tests that Andrey Yesyev is basing his on, I 
>> thought I should step in with a word of explanation. 
>>
>> Let me start by saying your findings are correct: the test indeed inserts 
>> a given number of vertices and then a given number of edges between the 
>> first two vertices. Generally speaking you are also right in saying "this 
>> benchmark does not reproduce real test cases". However, *it was never 
>> meant to be a general purpose benchmark* (please have a look at the 
>> disclaimer of my original post 
>> https://groups.google.com/forum/#!topicsearchin/orient-database/perfomance%7Csort:date%7Cspell:true/orient-database/VF_j5rGeffA).
>>  
>>
>>
>> The purpose of this test was to illustrate the fact that I found OrientDB 
>> to be very slow at inserting edges, and in fact getting slower and slower 
>> as the number of edges increases. I also compared it to Neo4j just because 
>> I wasn't sure whether this is something OrientDB-specific or due to the 
>> nature of graph databases in general. 
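
One way to check the "slower and slower" observation is to time consecutive batches of inserts instead of the whole run. The following is only a minimal sketch: the class name InsertTimer and the method timeBatches are hypothetical, and the insert itself is passed in as a callback so that the same harness can wrap an OrientDB or a Neo4j edge insert.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.IntConsumer;

public class InsertTimer {

    /**
     * Calls insertOne for indices 0..total-1 and records the elapsed
     * nanoseconds of each consecutive batch of batchSize calls.
     * The last batch may be shorter than batchSize.
     */
    public static List<Long> timeBatches(int total, int batchSize, IntConsumer insertOne) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<Long> nanosPerBatch = new ArrayList<>();
        long start = System.nanoTime();
        for (int i = 0; i < total; i++) {
            insertOne.accept(i);
            boolean batchDone = (i + 1) % batchSize == 0 || i == total - 1;
            if (batchDone) {
                long now = System.nanoTime();
                nanosPerBatch.add(now - start);
                start = now;
            }
        }
        return nanosPerBatch;
    }

    public static void main(String[] args) {
        // Stand-in workload (appending to a list); replace with an edge insert.
        List<Integer> sink = new ArrayList<>();
        List<Long> timings = timeBatches(100_000, 10_000, sink::add);
        for (int b = 0; b < timings.size(); b++) {
            System.out.printf("batch %d: %.2f ms%n", b, timings.get(b) / 1e6);
        }
    }
}
```

If the per-batch times grow steadily with the batch number, the cost per insert depends on how many edges already exist, which is the effect described above.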
>>
>> As far as transactions are concerned, my original code did not use 
>> transactions at all (at least not explicitly). According to the docs (back 
>> then), each operation was supposed to be executed instantly. I don't know 
>> (I didn't have the time to examine Andrey's code) why he introduced 
>> transactions, and while I agree that inserting millions of documents in a 
>> single transaction is not a good idea, I just wanted to point out that the 
>> original test demonstrated the problem with no transactions at all. I'm 
>> pretty sure Andrey can easily change the code to commit data in smaller 
>> chunks, but honestly speaking I don't expect huge improvements (compared 
>> to the no-transaction case). 
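
For reference, committing in smaller chunks can be sketched without touching the insert code itself. This is an illustration under assumptions, not what the test project does: ChunkedCommitter is a hypothetical helper, and the commit action is passed in as a Runnable (with a Blueprints transactional graph it could be a call to the graph's commit() method).

```java
public class ChunkedCommitter {
    private final int chunkSize;
    private final Runnable commit; // assumption: wraps the real commit call
    private int pending = 0;
    private int commits = 0;

    public ChunkedCommitter(int chunkSize, Runnable commit) {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        this.chunkSize = chunkSize;
        this.commit = commit;
    }

    /** Call once after every insert; commits when a chunk is full. */
    public void operationDone() {
        pending++;
        if (pending >= chunkSize) {
            flush();
        }
    }

    /** Commits whatever is still pending; call once at the end of the run. */
    public void flush() {
        if (pending > 0) {
            commit.run();
            commits++;
            pending = 0;
        }
    }

    /** Number of commits issued so far. */
    public int commitCount() {
        return commits;
    }
}
```

With a chunk size of 1000, a run of 2500 inserts followed by flush() issues three commits instead of one large one. Whether that changes the timings is exactly what would have to be measured.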
>>
>> As far as the structure of the data is concerned, I fail to see how that 
>> can cause performance degradation. Are you saying that if the test were 
>> to create edges between every 2 vertices, for example (instead of just the 
>> first 2), it would be faster? I highly doubt it. In fact I think the way the 
>> test is written should actually allow OrientDB to perform better than 
>> average, as it can utilize the cache and doesn't have to look for edges.
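
The two data shapes in question can be generated from the same loop, which would make it possible to measure the difference instead of guessing. A rough sketch follows; EdgeShapes and its methods are hypothetical names, and "every 2 vertices" is read here as disjoint pairs (0-1, 2-3, ...), which is an assumption.

```java
public class EdgeShapes {

    /** Endpoints of edge i when every edge joins the first two vertices. */
    public static int[] hubPair(int edgeIndex) {
        return new int[] {0, 1};
    }

    /** Endpoints of edge i when edges join disjoint pairs, wrapping around. */
    public static int[] disjointPair(int edgeIndex, int vertexCount) {
        int pairs = vertexCount / 2; // requires vertexCount >= 2
        int p = edgeIndex % pairs;
        return new int[] {2 * p, 2 * p + 1};
    }

    /** Largest number of edges touching a single vertex for the chosen shape. */
    public static int maxDegree(int edgeCount, int vertexCount, boolean hub) {
        int[] degree = new int[vertexCount];
        for (int i = 0; i < edgeCount; i++) {
            int[] e = hub ? hubPair(i) : disjointPair(i, vertexCount);
            degree[e[0]]++;
            degree[e[1]]++;
        }
        int max = 0;
        for (int d : degree) {
            max = Math.max(max, d);
        }
        return max;
    }
}
```

In the hub shape each of the two vertices carries every edge, while in the pairwise shape the per-vertex edge count stays small. Running the same insert loop over both shapes would show whether the per-vertex edge count matters.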
>>
>> Finally, I have to admit I gave up on OrientDB half a year ago (don't get 
>> me wrong, nothing personal, I just found it not to be mature enough for the 
>> project I was working on), and while I'm still trying to keep an eye on this 
>> list, I'm not fully aware of all the optimizations that have happened since 
>> then. It may be the case that the test is no longer valid for the current 
>> version or needs to be rewritten completely. If I find some spare time I 
>> will try to update my original tests to use the latest version and post 
>> some results here. 
>>
>> Regards,
>> Milen
>>       
>>
>>
>>  
>>
>> On Sun, Feb 9, 2014 at 8:27 PM, Andrey Lomakin <[email protected]> wrote:
>>
>>> Hi Andrey,
>>> I started the benchmark on my side, and while it was running I looked 
>>> into it. I think I should note that this benchmark does not reproduce real 
>>> test cases (I don't know what performance data you get on other DBs).
>>>
>>> Here is what this benchmark does.
>>> Let's suppose that we have to insert 1 000 000 documents (vertexes and 
>>> edges).
>>> Then it creates 500 000 vertexes, takes 2 of them, and creates 
>>> 500 000 edges between them.
>>> And everything is done in one transaction.
>>>
>>> So we have a graph database with 499 998 unconnected vertexes and 2 
>>> vertexes which have 500 000 edges between them, and everything is 
>>> committed in a single transaction.
>>> Did I miss something?
>>>
>>> I mean that I do not think you expect users to build such a data 
>>> structure and commit it in a single transaction.
>>> Usually data structures are quite different, and changes are committed in 
>>> the following way: users load data, change it, and commit it.
>>>
>>> It is my personal opinion, but maybe you would be interested in a 
>>> performance test which loads real Wikipedia data, loading and committing 
>>> it in small batches?
>>> This test also uses an index, which is very typical for DB usage.
>>>
>>> We have used such a test case, so I can adapt it and publish it as a Maven 
>>> project, and because it is TinkerPop-based you can test all the DBs you 
>>> are interested in.
>>> Our load test does not have properties on vertexes, only relations and 
>>> an index by page key, but it is simple to add additional properties.
>>>
>>> What do you think ?
>>>
>>>
>>>
>>> On Sun, Feb 9, 2014 at 5:59 PM, Andrey Yesyev <[email protected]> wrote:
>>>
>>>> Please post your results!
>>>>
>>>> Again, any comments regarding source code are very welcome!
>>>>
>>>>
>>>> On Sunday, February 9, 2014 10:50:34 AM UTC-5, Andrey Lomakin wrote:
>>>>
>>>>> Andrey,
>>>>> I do not see any commits in the project: 
>>>>> https://github.com/ayesyev/graphdb-tests/commits/master
>>>>> Did you push them?
>>>>>
>>>>>
>>>>> On Sun, Feb 9, 2014 at 5:47 PM, Andrey Lomakin <[email protected]> wrote:
>>>>>
>>>>>> Got it ! ))
>>>>>>
>>>>>>
>>>>>> On Sun, Feb 9, 2014 at 5:44 PM, Andrey Lomakin <[email protected]> wrote:
>>>>>>
>>>>>>> Hi Andrey,
>>>>>>>
>>>>>>> Could you provide instructions on how to run these tests and see the 
>>>>>>> statistical results?
>>>>>>>
>>>>>>>
>>>>>>> On Sun, Feb 9, 2014 at 4:59 PM, Andrey Yesyev <[email protected]> wrote:
>>>>>>>
>>>>>>>> Ok, here we go!
>>>>>>>>
>>>>>>>> I added all of Andrey's tips to the project.
>>>>>>>>
>>>>>>>> storage.diskCache.bufferSize is set to 14336.
>>>>>>>>
>>>>>>>> All edges have the appropriate number of properties and are added this way:
>>>>>>>>
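>>>>>>>> // Imports this method needs in its enclosing class:
>>>>>>>> import java.util.HashMap;
>>>>>>>> import java.util.Map;
>>>>>>>> import com.tinkerpop.blueprints.Vertex;
>>>>>>>> import com.tinkerpop.blueprints.impls.orient.OrientEdge;
>>>>>>>> import com.tinkerpop.blueprints.impls.orient.OrientVertex;
>>>>>>>>
>>>>>>>> protected OrientEdge createEdge(Vertex v1, Vertex v2) {
>>>>>>>>     // Build the property map: property0=value0, property1=value1, ...
>>>>>>>>     Map<String, String> properties = new HashMap<String, String>();
>>>>>>>>     for (int i = 0; i < numberOfProperties; i++) {
>>>>>>>>         properties.put("property" + i, "value" + i);
>>>>>>>>     }
>>>>>>>>     // Create an edge of class "E" from v1 to v2 with those properties.
>>>>>>>>     OrientEdge e = ((OrientVertex) v1).addEdge(null, (OrientVertex) v2, "E", null, properties);
>>>>>>>>     e.save();
>>>>>>>>     return e;
>>>>>>>> }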
>>>>>>>>
>>>>>>>> Results are attached for remote and embedded (both using the plocal 
>>>>>>>> storage type).
>>>>>>>> On Monday I'll try to draw my conclusions.
>>>>>>>>
>>>>>>>> All changes are committed to the GitHub project.
>>>>>>>>
>>>>>>>>
>>>>>>>>  -- 
>>>>>>>>  
>>>>>>>> --- 
>>>>>>>> You received this message because you are subscribed to the Google 
>>>>>>>> Groups "OrientDB" group.
>>>>>>>> To unsubscribe from this group and stop receiving emails from it, 
>>>>>>>> send an email to [email protected].
>>>>>>>>
>>>>>>>> For more options, visit https://groups.google.com/groups/opt_out.
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> -- 
>>>>>>> Best regards,
>>>>>>> Andrey Lomakin.
>>>>>>>
>>>>>>> Orient Technologies
>>>>>>> the Company behind OrientDB
>>>>>>>
>>>>>>>  
>>>>>>
>>>>>>
>>>>>> -- 
>>>>>> Best regards,
>>>>>> Andrey Lomakin.
>>>>>>
>>>>>> Orient Technologies
>>>>>> the Company behind OrientDB
>>>>>>
>>>>>>  
>>>>>
>>>>>
>>>>> -- 
>>>>> Best regards,
>>>>> Andrey Lomakin.
>>>>>
>>>>> Orient Technologies
>>>>> the Company behind OrientDB
>>>>>
>>>
>>>
>>>
>>> -- 
>>> Best regards,
>>> Andrey Lomakin.
>>>
>>> Orient Technologies
>>> the Company behind OrientDB
>>>
>>
>>
>>
>> -- 
>> http://about.me/milen
>
>
>
> -- 
> Best regards,
> Andrey Lomakin.
>
> Orient Technologies
> the Company behind OrientDB
>
>  

