It would be great if someone benchmarked OrientDB, MySQL (with joins), and MySQL/Neo4J to get some speed comparisons. I noticed there are some benchmarks out there, but they use older versions of OrientDB (1.3).
On Monday, February 10, 2014 5:54:54 AM UTC-5, Andrey Lomakin wrote:
>
> Hi all,
>
> Thank you all for the answers.
> My main concern here is that benchmarks should use cases
> which are close to real ones.
>
> About edge distribution: we use a cache to optimize loops in the graph. I mean,
> if a vertex is created and then loaded to create an edge, there is a good
> probability that it will be in the cache.
> Anyway, I gathered links to benchmarks which we used or are going to use.
>
> Here is a load test of Wikipedia data:
> https://github.com/laa/orientdb-wikipedia-benchmark
> and there is a very interesting benchmark here:
> https://github.com/Morro/GraphDBBenchmark
>
> So if you publish your data using them, I will appreciate it very much.
>
> On Sun, Feb 9, 2014 at 10:23 PM, Milen Dyankov <[email protected]> wrote:
>
>> Hello Andrey Lomakin,
>>
>> as I wrote the original tests that Andrey Yesyev is basing his on, I
>> thought I needed to step in with a word of explanation.
>>
>> Let me start by saying your findings are correct: the test indeed inserts
>> a given amount of vertices and then a given amount of edges between the first
>> two vertices. Generally speaking, you are also right in saying "this
>> benchmark does not reproduce real test cases". However, *it was never
>> meant to be a general purpose benchmark* (please have a look at the
>> disclaimer of my original post
>> https://groups.google.com/forum/#!topicsearchin/orient-database/perfomance%7Csort:date%7Cspell:true/orient-database/VF_j5rGeffA).
>>
>> The purpose of this test was to illustrate the fact that I found OrientDB
>> to be very slow at inserting edges, and in fact getting slower and slower as
>> the amount of edges increases. I also compared it to Neo4j just because I
>> wasn't sure whether this is something OrientDB specific or whether it's due to the
>> nature of graph databases in general.
>>
>> As far as transactions are concerned, my original code did not use
>> transactions at all (at least not explicitly). According to the docs (back
>> then) this was supposed to execute each operation instantly. I don't know
>> (I didn't have the time to examine Andrey's code) why he introduced
>> transactions, and while I agree that inserting millions of documents in a single
>> transaction is not a good idea, I just wanted to point out that the original
>> test was demonstrating the problem with no transactions at all. I'm pretty
>> sure Andrey can easily change the code to commit data in smaller chunks, but
>> honestly speaking I don't expect huge improvements (compared to the
>> no-transaction case).
>>
>> As far as the structure of the data is concerned, I fail to see how
>> that can cause performance degradation. Are you saying that if the test were
>> to create edges between every 2 vertices, for example (instead of just the first
>> 2), it would be faster? I highly doubt it. In fact, I think the way the test
>> is written should actually allow OrientDB to perform better than average, as
>> it can utilize the cache and doesn't have to look for edges.
>>
>> Finally, I have to admit I gave up on OrientDB half a year ago (don't get
>> me wrong, nothing personal, I just found it not to be mature enough for the
>> project I was working on), and while I'm still trying to keep an eye on this
>> list, I'm not fully aware of all the optimizations that have happened since
>> then. It may be the case that the test is no longer valid for the current
>> version or needs to be rewritten completely. If I find some spare time I
>> will try to update my original tests to use the latest version and post
>> some results here.
>>
>> Regards,
>> Milen
>>
>> On Sun, Feb 9, 2014 at 8:27 PM, Andrey Lomakin <[email protected]> wrote:
>>
>>> Hi Andrey,
>>> I started the benchmark on my side, and while it is running I investigated it.
>>> I think that I should note that this benchmark does not reproduce real
>>> test cases (dunno what performance data you get on other DBs).
>>>
>>> Here is what this benchmark does.
>>> Let's suppose that we have to insert 1 000 000 documents (vertexes and
>>> edges).
>>> Then it creates 500 000 vertexes, then takes 2 of them and creates
>>> 500 000 edges between them.
>>> And everything happens in one transaction.
>>>
>>> So we have a graph database with 499 998 unconnected vertexes and 2
>>> vertexes which have 500 000 edges, and everything is committed in a single
>>> transaction.
>>> Did I miss something?
>>>
>>> I mean, I don't think you expect users to create such a data
>>> structure and commit it in a single transaction.
>>> Usually data structures are way different, and changes are committed in
>>> the following way: users load data, change it, and commit it.
>>>
>>> It is my personal opinion, but maybe you will be interested in a
>>> performance test which loads real Wikipedia data, loading and committing
>>> it in small batches?
>>> This test also uses an index, which is very typical for db usage.
>>>
>>> We used such a test case, so I can change it and publish it as a maven project,
>>> and because it is tinkerpop based you can test all the dbs which you are
>>> interested in.
>>> Our load test does not have properties on vertexes, only relations and an
>>> index by page key, but it is simple to add additional properties.
>>>
>>> What do you think?
>>>
>>> On Sun, Feb 9, 2014 at 5:59 PM, Andrey Yesyev <[email protected]> wrote:
>>>
>>>> Please post your results!
>>>>
>>>> Again, any comments regarding the source code are very welcome!
>>>>
>>>> On Sunday, February 9, 2014 10:50:34 AM UTC-5, Andrey Lomakin wrote:
>>>>
>>>>> Andrey,
>>>>> I do not see any commits in the project:
>>>>> https://github.com/ayesyev/graphdb-tests/commits/master
>>>>> Did you push them?
>>>>>
>>>>> On Sun, Feb 9, 2014 at 5:47 PM, Andrey Lomakin <[email protected]> wrote:
>>>>>
>>>>>> Got it! ))
>>>>>>
>>>>>> On Sun, Feb 9, 2014 at 5:44 PM, Andrey Lomakin <[email protected]> wrote:
>>>>>>
>>>>>>> Hi Andrey,
>>>>>>>
>>>>>>> Could you provide instructions on how to run these tests and see
>>>>>>> the statistical results?
>>>>>>>
>>>>>>> On Sun, Feb 9, 2014 at 4:59 PM, Andrey Yesyev <[email protected]> wrote:
>>>>>>>
>>>>>>>> Ok, here we go!
>>>>>>>>
>>>>>>>> I added all of Andrey's tips to the project.
>>>>>>>>
>>>>>>>> storage.diskCache.bufferSize is set to 14336.
>>>>>>>>
>>>>>>>> All edges have the appropriate number of properties and are added this way:
>>>>>>>>
>>>>>>>>     protected OrientEdge createEdge(Vertex v1, Vertex v2) {
>>>>>>>>         // Build property0..propertyN-1 with matching values.
>>>>>>>>         Map<String, String> properties = new HashMap<String, String>();
>>>>>>>>         for (int i = 0; i < numberOfProperties; i++)
>>>>>>>>             properties.put("property" + i, "value" + i);
>>>>>>>>         // Create the "E" edge from v1 to v2 with these properties and save it.
>>>>>>>>         OrientEdge e = ((OrientVertex)v1).addEdge(null,
>>>>>>>>                 (OrientVertex)v2, "E", null, properties);
>>>>>>>>         e.save();
>>>>>>>>
>>>>>>>>         return e;
>>>>>>>>     }
>>>>>>>>
>>>>>>>> Results are attached for remote and embedded (both using the plocal
>>>>>>>> storage type).
>>>>>>>> On Monday I'll try to draw my conclusions.
>>>>>>>>
>>>>>>>> All changes are committed to the github project.
>>>>>>>>
>>>>>>>> --
>>>>>>>>
>>>>>>>> ---
>>>>>>>> You received this message because you are subscribed to the Google
>>>>>>>> Groups "OrientDB" group.
>>>>>>>> To unsubscribe from this group and stop receiving emails from it,
>>>>>>>> send an email to [email protected].
>>>>>>>>
>>>>>>>> For more options, visit https://groups.google.com/groups/opt_out.
>>>>>>>
>>>>>>> --
>>>>>>> Best regards,
>>>>>>> Andrey Lomakin.
>>>>>>>
>>>>>>> Orient Technologies
>>>>>>> the Company behind OrientDB
>>>>>>
>>>>>> --
>>>>>> Best regards,
>>>>>> Andrey Lomakin.
>> --
>> http://about.me/milen
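For readers who want to try the suggestion made in the thread (committing in small batches rather than in one large transaction), the idea can be sketched in plain Java as follows. This is only a minimal illustration under stated assumptions: `EdgeStore`, `CountingStore`, and `loadEdges` are hypothetical names made up for this sketch and are not part of OrientDB or TinkerPop; a real test would call the graph's own add-edge and commit operations in their place. Like the benchmark discussed above, it adds every edge between the same two vertices.

```java
import java.util.ArrayList;
import java.util.List;

public class BatchedEdgeLoad {

    /** Hypothetical stand-in for a transactional graph; not an OrientDB or TinkerPop type. */
    interface EdgeStore {
        void addEdge(int fromVertex, int toVertex);
        void commit();
    }

    /** In-memory store that only records how many edges each commit contained. */
    static class CountingStore implements EdgeStore {
        final List<Integer> commitSizes = new ArrayList<>();
        private int pending = 0;

        @Override
        public void addEdge(int fromVertex, int toVertex) {
            pending++;
        }

        @Override
        public void commit() {
            commitSizes.add(pending);
            pending = 0;
        }
    }

    /** Adds edgeCount edges between vertex 0 and vertex 1, committing every batchSize edges. */
    static void loadEdges(EdgeStore store, int edgeCount, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        for (int i = 1; i <= edgeCount; i++) {
            store.addEdge(0, 1);
            if (i % batchSize == 0) {
                store.commit();
            }
        }
        if (edgeCount % batchSize != 0) {
            store.commit(); // flush the final partial batch
        }
    }

    public static void main(String[] args) {
        CountingStore store = new CountingStore();
        loadEdges(store, 2500, 1000);
        System.out.println(store.commitSizes); // prints [1000, 1000, 500]
    }
}
```

Running the same load with several batch sizes (including one batch equal to the full edge count) would show whether the transaction size, rather than the edge count itself, accounts for the slowdown reported in the thread.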
