Re: [orientdb] Re: 1.7rc2 performance results

Giraldo Rosales Thu, 13 Feb 2014 12:41:05 -0800

That would be great! Thanks!!
 On Feb 13, 2014 3:40 PM, "Andrey Yesyev" <[email protected]> wrote:


> I'm about to implement that test for my use-case. I'll post my results as
> soon as possible. I'm not sure they'll be interesting to many people though.
> On Feb 13, 2014 3:36 PM, "Giraldo Rosales" <[email protected]> wrote:
>
>> Yes, a comparison. Some businesses refuse to leave relational databases
>> even in cases where a graph model would fit better. I know Neo4j is a graph
>> database, but they recommend a double query: the first in Neo4j to get the
>> relationships and return the IDs, and a second in MySQL to get the data.
>> OrientDB would have both combined.
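
For readers unfamiliar with the "double query" pattern described above, here is a minimal sketch of it. It assumes nothing about the real Neo4j or MySQL drivers: both stores are hypothetical in-memory maps, and the names `RELATIONSHIPS`, `USER_ROWS`, `relatedIds` and `loadRows` are invented for illustration only.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory stand-ins: no real Neo4j or MySQL driver is used here.
final class DoubleQuerySketch {
    // Stand-in for the graph store: user id -> ids of related users.
    static final Map<Long, List<Long>> RELATIONSHIPS = new HashMap<>();
    // Stand-in for the relational store: user id -> row data.
    static final Map<Long, String> USER_ROWS = new HashMap<>();

    static {
        RELATIONSHIPS.put(1L, Arrays.asList(2L, 3L));
        USER_ROWS.put(1L, "alice");
        USER_ROWS.put(2L, "bob");
        USER_ROWS.put(3L, "carol");
    }

    // Query 1: ask the graph store only for the ids of related records.
    static List<Long> relatedIds(long userId) {
        List<Long> ids = RELATIONSHIPS.get(userId);
        return ids == null ? new ArrayList<Long>() : ids;
    }

    // Query 2: fetch the actual data for those ids from the relational store.
    static List<String> loadRows(List<Long> ids) {
        List<String> rows = new ArrayList<>();
        for (Long id : ids) {
            String row = USER_ROWS.get(id);
            if (row != null) {
                rows.add(row);
            }
        }
        return rows;
    }

    public static void main(String[] args) {
        // Two round trips: ids first, then data.
        System.out.println(loadRows(relatedIds(1L)));
    }
}
```

A benchmark of this pattern would need to time both round trips together, since a combined graph/document store answers the same question in a single call.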
>>
>> The tests would cover large use-case scenarios: complex selects and high
>> insert/update volumes. For example, 1 million users linked to an object,
>> then traversing down a selected group, with the same data loaded into each
>> database. Which is fastest? It would also help OrientDB developers see
>> where performance work may be needed, if any.
>> On Feb 13, 2014 3:24 PM, "Andrey Yesyev" <[email protected]> wrote:
>>
>>> Do you mean compare them?
>>> If so, what exactly? Insertion rate? Query?
>>>
>>> In my opinion, it's not technically correct to compare relational and
>>> graph DBs.
>>>
>>> On Thursday, February 13, 2014 3:19:43 PM UTC-5, Giraldo Rosales wrote:
>>>>
>>>> It would be great if someone benchmarked OrientDB, MySQL (with joins),
>>>> and MySQL/Neo4j to get some speed comparisons. I noticed there were some
>>>> benchmarks out there with older versions of OrientDB (1.3).
>>>>
>>>>
>>>>
>>>>
>>>> On Monday, February 10, 2014 5:54:54 AM UTC-5, Andrey Lomakin wrote:
>>>>>
>>>>> Hi all,
>>>>>
>>>>> Thank you all for your answers.
>>>>> My main concern here is that benchmarks should use cases that are
>>>>> close to real ones.
>>>>>
>>>>> About edge distribution: we use a cache to optimize loops in the graph.
>>>>> I mean that if a vertex is created and then loaded to create an edge,
>>>>> there is a good probability that it will be in the cache.
>>>>> Anyway, I gathered links to benchmarks which we used or are going to
>>>>> use.
>>>>>
>>>>> Here is a load test of Wikipedia data:
>>>>> https://github.com/laa/orientdb-wikipedia-benchmark
>>>>> and there is a very interesting benchmark here:
>>>>> https://github.com/Morro/GraphDBBenchmark
>>>>>
>>>>> So if you publish your data using them, I will appreciate it very much.
>>>>>
>>>>>
>>>>>
>>>>> On Sun, Feb 9, 2014 at 10:23 PM, Milen Dyankov <[email protected]> wrote:
>>>>>
>>>>>> Hello Andrey Lomakin,
>>>>>>
>>>>>> as I wrote the original tests that Andrey Yesyev is basing his on, I
>>>>>> thought I needed to step in with a word of explanation.
>>>>>>
>>>>>> Let me start by saying your findings are correct: the test indeed
>>>>>> inserts a given number of vertices and then a given number of edges
>>>>>> between the first two vertices. Generally speaking you are also right in
>>>>>> saying "this benchmark does not reproduce real test cases". However, *it
>>>>>> was never meant to be a general purpose benchmark* (please have a look at
>>>>>> the disclaimer of my original post
>>>>>> https://groups.google.com/forum/#!topicsearchin/orient-database/perfomance%7Csort:date%7Cspell:true/orient-database/VF_j5rGeffA).
>>>>>>
>>>>>> The purpose of this test was to illustrate the fact that I found
>>>>>> OrientDB to be very slow at inserting edges, and in fact getting slower
>>>>>> and slower as the number of edges increases. I also compared it to Neo4j
>>>>>> just because I wasn't sure whether this is something OrientDB-specific
>>>>>> or due to the nature of graph databases in general.
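
One way to make "slower and slower as the number of edges increases" visible is to time inserts per fixed-size window instead of reporting one total. The sketch below is an illustration only, not the code from the original test: `InsertOp` is a hypothetical hook standing in for whatever call creates one edge in the database under test, and the `main` method uses an in-memory list as a placeholder workload.

```java
import java.util.ArrayList;
import java.util.List;

final class WindowedInsertTimer {
    // Hypothetical hook: one call performs one edge insert against the database under test.
    interface InsertOp {
        void insert(long index);
    }

    // Runs `total` inserts and returns the elapsed nanoseconds for each window
    // of `windowSize` inserts (the last window may be shorter).
    static List<Long> timeWindows(long total, long windowSize, InsertOp op) {
        if (windowSize <= 0) {
            throw new IllegalArgumentException("windowSize must be positive");
        }
        List<Long> windowNanos = new ArrayList<>();
        long windowStart = System.nanoTime();
        for (long i = 1; i <= total; i++) {
            op.insert(i);
            if (i % windowSize == 0 || i == total) {
                long now = System.nanoTime();
                windowNanos.add(now - windowStart);
                windowStart = now;
            }
        }
        return windowNanos;
    }

    public static void main(String[] args) {
        // Placeholder workload: appending to a list instead of inserting edges.
        List<Long> sink = new ArrayList<>();
        List<Long> windows = timeWindows(100_000, 10_000, i -> sink.add(i));
        for (int w = 0; w < windows.size(); w++) {
            System.out.println("window " + w + ": " + windows.get(w) / 1_000_000.0 + " ms");
        }
    }
}
```

If insert cost is roughly constant, the per-window times stay flat; a steady upward trend across windows is the degradation described above.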
>>>>>>
>>>>>> As far as transactions are concerned, my original code did not use
>>>>>> transactions at all (at least not explicitly). According to the docs
>>>>>> (back then), it was supposed to execute each operation instantly. I don't
>>>>>> know (I didn't have the time to examine Andrey's code) why he introduced
>>>>>> transactions, and while I agree that inserting millions of documents in a
>>>>>> single transaction is not a good idea, I just wanted to point out that
>>>>>> the original test demonstrated the problem with no transactions at all.
>>>>>> I'm pretty sure Andrey can easily change the code to commit data in
>>>>>> smaller chunks, but honestly speaking I don't expect huge improvements
>>>>>> (compared to the no-transaction case).
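
"Commit data in smaller chunks" could look roughly like the sketch below. This is an assumption-laden illustration rather than the test's actual code: `Tx` is a hypothetical interface standing in for the graph's commit call, `operation` stands in for one insert, and the batch size of 10 000 in `main` is an arbitrary example value.

```java
final class ChunkedCommitSketch {
    // Hypothetical transaction hook; a real test would delegate to the graph's commit call.
    interface Tx {
        void commit();
    }

    // Applies `total` operations, committing after every `batchSize` of them and
    // once more at the end if a partial batch remains. Returns the commit count.
    static int runInBatches(int total, int batchSize, Runnable operation, Tx tx) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        int commits = 0;
        int pending = 0;
        for (int i = 0; i < total; i++) {
            operation.run();
            pending++;
            if (pending == batchSize) {
                tx.commit();
                commits++;
                pending = 0;
            }
        }
        if (pending > 0) {
            tx.commit();
            commits++;
        }
        return commits;
    }

    public static void main(String[] args) {
        int[] inserted = {0};
        // No-op commit and a counter instead of real inserts.
        int commits = runInBatches(1_000_000, 10_000, () -> inserted[0]++, () -> { });
        System.out.println(inserted[0] + " operations in " + commits + " commits");
    }
}
```

Running the same workload with several batch sizes (and with no explicit transaction) would show whether chunking changes the picture at all.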
>>>>>>
>>>>>> As far as the structure of the data is concerned, I fail to see how
>>>>>> that can cause performance degradation. Are you saying that if the
>>>>>> test were to create edges between every two vertices, for example
>>>>>> (instead of just the first two), it would be faster? I highly doubt it.
>>>>>> In fact, I think the way the test is written should actually allow
>>>>>> OrientDB to perform better than average, as it can utilize the cache and
>>>>>> doesn't have to look for edges.
>>>>>>
>>>>>> Finally, I have to admit I gave up on OrientDB half a year ago (don't
>>>>>> get me wrong, nothing personal, I just found it not to be mature enough
>>>>>> for the project I was working on), and while I'm still trying to keep an
>>>>>> eye on this list, I'm not fully aware of all the optimizations that have
>>>>>> happened since then. It may be the case that the test is no longer valid
>>>>>> for the current version or needs to be rewritten completely. If I find
>>>>>> some spare time I will try to update my original tests to use the latest
>>>>>> version and post some results here.
>>>>>>
>>>>>> Regards,
>>>>>> Milen
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Sun, Feb 9, 2014 at 8:27 PM, Andrey Lomakin <[email protected]> wrote:
>>>>>>
>>>>>>> Hi Andrey,
>>>>>>> I started the benchmark on my side, and while it was running I
>>>>>>> investigated it.
>>>>>>> I think I should note that this benchmark does not reproduce
>>>>>>> real test cases (I don't know what performance data you get on other DBs).
>>>>>>>
>>>>>>> Here is what this benchmark does.
>>>>>>> Let's suppose that we have to insert 1 000 000 documents (vertexes and
>>>>>>> edges).
>>>>>>> It creates 500 000 vertexes, then takes 2 of them and
>>>>>>> creates 500 000 edges between them.
>>>>>>> And everything happens in one transaction.
>>>>>>>
>>>>>>> So we have a graph database with 499 998 unconnected vertexes and 2
>>>>>>> vertexes which have 500 000 edges, and everything is committed in a
>>>>>>> single transaction.
>>>>>>> Did I miss something?
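
The shape described above can be checked with a few lines of plain Java. This is only a model of the resulting graph (a degree histogram), not the benchmark itself; the class and method names are made up for the illustration, and it assumes the two chosen vertexes are the first two created.

```java
import java.util.Map;
import java.util.TreeMap;

final class BenchmarkShapeSketch {
    // Returns degree -> number of vertexes with that degree, for a graph built as:
    // `vertexCount` vertexes, then `edgeCount` edges, all between vertex 0 and vertex 1.
    static Map<Long, Long> degreeHistogram(int vertexCount, long edgeCount) {
        if (vertexCount < 2) {
            throw new IllegalArgumentException("need at least two vertexes");
        }
        long[] degree = new long[vertexCount];
        for (long e = 0; e < edgeCount; e++) {
            // Every edge touches the same two vertexes.
            degree[0]++;
            degree[1]++;
        }
        Map<Long, Long> histogram = new TreeMap<>();
        for (long d : degree) {
            histogram.merge(d, 1L, Long::sum);
        }
        return histogram;
    }

    public static void main(String[] args) {
        // 500 000 vertexes and 500 000 edges, as in the description above.
        System.out.println(degreeHistogram(500_000, 500_000));
    }
}
```

For 500 000 vertexes and 500 000 edges the histogram has exactly two entries: 499 998 vertexes of degree 0 and 2 vertexes of degree 500 000.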
>>>>>>>
>>>>>>> I mean that I do not think you expect users to create such a data
>>>>>>> structure and commit it in a single transaction.
>>>>>>> Usually data structures are quite different, and changes are committed
>>>>>>> in the following way: users load data, change it, and commit it.
>>>>>>>
>>>>>>> It is my personal opinion, but maybe you will be interested in a
>>>>>>> performance test which loads real Wikipedia data, loading and
>>>>>>> committing it in small batches?
>>>>>>> This test also uses an index, which is very typical for DB usage.
>>>>>>>
>>>>>>> We used such a test case, so I can change it and publish it as a Maven
>>>>>>> project, and because it is TinkerPop-based you can test all the DBs you
>>>>>>> are interested in.
>>>>>>> Our load test does not have properties on vertexes, only relations
>>>>>>> and an index by page key, but it is simple to add additional properties.
>>>>>>>
>>>>>>> What do you think?
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On Sun, Feb 9, 2014 at 5:59 PM, Andrey Yesyev <[email protected]> wrote:
>>>>>>>
>>>>>>>> Please post your results!
>>>>>>>>
>>>>>>>> Again, any comments regarding source code are very welcome!
>>>>>>>>
>>>>>>>>
>>>>>>>> On Sunday, February 9, 2014 10:50:34 AM UTC-5, Andrey Lomakin wrote:
>>>>>>>>
>>>>>>>>> Andrey,
>>>>>>>>> I do not see any commits in the project:
>>>>>>>>> https://github.com/ayesyev/graphdb-tests/commits/master
>>>>>>>>> Did you push them?
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Sun, Feb 9, 2014 at 5:47 PM, Andrey Lomakin <
>>>>>>>>> [email protected]> wrote:
>>>>>>>>>
>>>>>>>>>> Got it ! ))
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On Sun, Feb 9, 2014 at 5:44 PM, Andrey Lomakin <
>>>>>>>>>> [email protected]> wrote:
>>>>>>>>>>
>>>>>>>>>>> Hi Andrey,
>>>>>>>>>>>
>>>>>>>>>>> Could you provide instructions on how to run these tests and see
>>>>>>>>>>> the statistical results?
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On Sun, Feb 9, 2014 at 4:59 PM, Andrey Yesyev <
>>>>>>>>>>> [email protected]> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Ok, here we go!
>>>>>>>>>>>>
>>>>>>>>>>>> I added all of Andrey's tips to the project.
>>>>>>>>>>>>
>>>>>>>>>>>> storage.diskCache.bufferSize is set to 14336.
>>>>>>>>>>>>
>>>>>>>>>>>> All edges have the appropriate number of properties and are added
>>>>>>>>>>>> this way:
>>>>>>>>>>>>
>>>>>>>>>>>> // Creates an edge of class "E" from v1 to v2 carrying
>>>>>>>>>>>> // numberOfProperties string properties, then saves it.
>>>>>>>>>>>> protected OrientEdge createEdge(Vertex v1, Vertex v2) {
>>>>>>>>>>>>     Map<String, String> properties = new HashMap<String, String>();
>>>>>>>>>>>>     for (int i = 0; i < numberOfProperties; i++)
>>>>>>>>>>>>         properties.put("property" + i, "value" + i);
>>>>>>>>>>>>     OrientEdge e = ((OrientVertex) v1).addEdge(null, (OrientVertex) v2, "E", null, properties);
>>>>>>>>>>>>     e.save();
>>>>>>>>>>>>     return e;
>>>>>>>>>>>> }
>>>>>>>>>>>>
>>>>>>>>>>>> Results are attached for remote and embedded (both using plocal
>>>>>>>>>>>> storage type).
>>>>>>>>>>>> On Monday I'll try to make my conclusions.
>>>>>>>>>>>>
>>>>>>>>>>>> All changes are committed to github project.
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>  --
>>>>>>>>>>>>
>>>>>>>>>>>> ---
>>>>>>>>>>>> You received this message because you are subscribed to the
>>>>>>>>>>>> Google Groups "OrientDB" group.
>>>>>>>>>>>> To unsubscribe from this group and stop receiving emails from
>>>>>>>>>>>> it, send an email to [email protected].
>>>>>>>>>>>>
>>>>>>>>>>>> For more options, visit https://groups.google.com/groups/opt_out.
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> --
>>>>>>>>>>> Best regards,
>>>>>>>>>>> Andrey Lomakin.
>>>>>>>>>>>
>>>>>>>>>>> Orient Technologies
>>>>>>>>>>> the Company behind OrientDB
>>>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> --
>>>>>> http://about.me/milen
>>>>>   --
>>>
>>> ---
>>> You received this message because you are subscribed to a topic in the
>>> Google Groups "OrientDB" group.
>>> To unsubscribe from this topic, visit
>>> https://groups.google.com/d/topic/orient-database/QscZaIK5JPU/unsubscribe
>>> .
>>> To unsubscribe from this group and all its topics, send an email to
>>> [email protected].
>>> For more options, visit https://groups.google.com/groups/opt_out.
>>>
