Re: [orientdb] Re: 1.7rc2 performance results

Giraldo Rosales Thu, 13 Feb 2014 12:37:08 -0800

Yes, a comparison. I know some businesses refuse to leave relational
databases even in cases where a graph would fit better. I do know Neo4j is
a graph database, but they recommend doing a double query: the first in
Neo4j to get the relationships and return the ids, and a second in MySQL
to get the data. OrientDB would have both combined.
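
A rough sketch of what I mean by the double query, with plain Java maps
standing in for both stores. The class and method names are made up for
illustration, and no real Neo4j or MySQL driver calls are used here:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory stand-ins; this does not talk to any real database.
class TwoStepLookup {
    // Stand-in for the graph store: user id -> ids of related users.
    static final Map<Long, List<Long>> GRAPH = new HashMap<>();
    // Stand-in for the relational store: user id -> row data.
    static final Map<Long, String> ROWS = new HashMap<>();

    static {
        GRAPH.put(1L, Arrays.asList(2L, 3L));
        ROWS.put(1L, "alice");
        ROWS.put(2L, "bob");
        ROWS.put(3L, "carol");
    }

    // Query 1: ask the graph side only for the related ids.
    static List<Long> relatedIds(long userId) {
        List<Long> ids = GRAPH.get(userId);
        return ids == null ? Collections.<Long>emptyList() : ids;
    }

    // Query 2: fetch the row data for those ids from the relational side.
    static List<String> fetchRows(List<Long> ids) {
        List<String> rows = new ArrayList<>();
        for (Long id : ids) {
            String row = ROWS.get(id);
            if (row != null) {
                rows.add(row);
            }
        }
        return rows;
    }

    // The application has to glue the two round trips together itself.
    static List<String> friendsOf(long userId) {
        return fetchRows(relatedIds(userId));
    }

    public static void main(String[] args) {
        System.out.println(friendsOf(1L));
    }
}
```

The point is only that the application pays for two round trips and does
the join by hand; a store that keeps links and data together could answer
this in one query.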


The tests would cover large use-case scenarios: complex selects and high
insert/update rates. For example, 1 million users linked to an object, then
traversing down a selected group, with the same data loaded into each
database. Which is fastest? It would also help the OrientDB developers see
where performance work may be needed, if any.
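
To make the scenario concrete, here is a minimal timing sketch. It assumes
an in-memory adjacency map as a stand-in for the database under test; the
names (TraversalBench, buildHubGraph, traverse, timeNanos) are hypothetical,
and a real comparison would swap in each database's own client:

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical harness: the adjacency map stands in for the store being measured.
class TraversalBench {
    // Builds one hub object (id 0) linked to `users` user vertices (ids 1..users).
    static Map<Integer, List<Integer>> buildHubGraph(int users) {
        Map<Integer, List<Integer>> adj = new HashMap<>();
        List<Integer> hubLinks = new ArrayList<>(users);
        for (int id = 1; id <= users; id++) {
            hubLinks.add(id);
            adj.put(id, Collections.<Integer>emptyList()); // users have no outgoing links
        }
        adj.put(0, hubLinks);
        return adj;
    }

    // Breadth-first traversal from `start`; returns how many vertices were reached.
    static int traverse(Map<Integer, List<Integer>> adj, int start) {
        Set<Integer> seen = new HashSet<>();
        Deque<Integer> queue = new ArrayDeque<>();
        seen.add(start);
        queue.add(start);
        while (!queue.isEmpty()) {
            int current = queue.poll();
            List<Integer> links = adj.get(current);
            if (links == null) {
                continue;
            }
            for (int next : links) {
                if (seen.add(next)) {
                    queue.add(next);
                }
            }
        }
        return seen.size();
    }

    // Wall-clock time of one run of `task`, in nanoseconds.
    static long timeNanos(Runnable task) {
        long started = System.nanoTime();
        task.run();
        return System.nanoTime() - started;
    }

    public static void main(String[] args) {
        final Map<Integer, List<Integer>> graph = buildHubGraph(1_000_000);
        long nanos = timeNanos(() -> traverse(graph, 0));
        System.out.println("traversal took " + (nanos / 1_000_000) + " ms");
    }
}
```

A single timed run like this is only a starting point; a fair test would
repeat each operation several times after a warm-up and load identical
data into every database.
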
On Feb 13, 2014 3:24 PM, "Andrey Yesyev" <[email protected]> wrote:

> Do you mean compare them?
> If so, what exactly? Insertion rate? Query?
>
> In my opinion, it's not technically correct to compare relational and
> graph DBs.
>
> On Thursday, February 13, 2014 3:19:43 PM UTC-5, Giraldo Rosales wrote:
>>
>> It would be great if someone would benchmark OrientDB, MySQL (with joins),
>> and MySQL/Neo4j to get some speed tests. I noticed there were some out
>> there with older versions of OrientDB (1.3).
>>
>>
>>
>>
>> On Monday, February 10, 2014 5:54:54 AM UTC-5, Andrey Lomakin wrote:
>>>
>>> Hi all,
>>>
>>> Thank you all for your answers.
>>> My main concern here is that for benchmarks we should use cases
>>> which are close to real ones.
>>>
>>> About edge distribution: we use a cache to optimize loops in the graph.
>>> I mean that if a vertex is created and then loaded to create an edge,
>>> there is a good probability that it will be in the cache.
>>> Anyway, I gathered links to benchmarks which we have used or are going
>>> to use.
>>>
>>> Here is a load test on Wikipedia data:
>>> https://github.com/laa/orientdb-wikipedia-benchmark
>>> and there is a very interesting benchmark here:
>>> https://github.com/Morro/GraphDBBenchmark
>>>
>>> So if you publish your data using them, I will appreciate it very much.
>>>
>>>
>>>
>>> On Sun, Feb 9, 2014 at 10:23 PM, Milen Dyankov <[email protected]>wrote:
>>>
>>>> Hello Andrey Lomakin,
>>>>
>>>> as I wrote the original tests that Andrey Yesyev is basing his on, I
>>>> thought I needed to step in with a word of explanation.
>>>>
>>>> Let me start by saying your findings are correct: the test indeed
>>>> inserts a given number of vertices and then a given number of edges
>>>> between the first two vertices. Generally speaking, you are also right
>>>> in saying "this benchmark does not reproduce real test cases". However,
>>>> *it was never meant to be a general purpose benchmark* (please have a
>>>> look at the disclaimer of my original post:
>>>> https://groups.google.com/forum/#!topicsearchin/orient-database/perfomance%7Csort:date%7Cspell:true/orient-database/VF_j5rGeffA).
>>>>
>>>> The purpose of this test was to illustrate the fact that I found
>>>> OrientDB to be very slow at inserting edges, in fact getting slower and
>>>> slower as the number of edges increases. I also compared it to Neo4j just
>>>> because I wasn't sure whether this was something OrientDB-specific or
>>>> due to the nature of graph databases in general.
>>>>
>>>> As far as transactions are concerned, my original code did not use
>>>> transactions at all (at least not explicitly). According to the docs
>>>> (back then), it was supposed to execute each operation instantly. I
>>>> don't know (I didn't have the time to examine Andrey's code) why he
>>>> introduced transactions, and while I agree that inserting millions of
>>>> documents in a single transaction is not a good idea, I just wanted to
>>>> point out that the original test demonstrated the problem with no
>>>> transactions at all. I'm pretty sure Andrey can easily change the code
>>>> to commit data in smaller chunks, but honestly speaking I don't expect
>>>> huge improvements (compared to the no-transaction case).
>>>>
>>>> As far as the structure of the data is concerned, I fail to see how
>>>> that can cause performance degradation. Are you saying that if the test
>>>> were to create edges between every two vertices, for example (instead of
>>>> just the first two), it would be faster? I highly doubt it. In fact, I
>>>> think the way the test is written should actually allow OrientDB to
>>>> perform better than average, as it can utilize the cache and doesn't
>>>> have to look for edges.
>>>>
>>>> Finally, I have to admit I gave up on OrientDB half a year ago (don't
>>>> get me wrong, nothing personal, I just found it not to be mature enough
>>>> for the project I was working on), and while I'm still trying to keep an
>>>> eye on this list, I'm not fully aware of all the optimizations that have
>>>> happened since then. It may be the case that the test is no longer valid
>>>> for the current version or needs to be rewritten completely. If I find
>>>> some spare time, I will try to update my original tests to use the
>>>> latest version and post some results here.
>>>>
>>>> Regards,
>>>> Milen
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> On Sun, Feb 9, 2014 at 8:27 PM, Andrey Lomakin <[email protected]>wrote:
>>>>
>>>>> Hi Andrey,
>>>>> I started the benchmark on my side, and while it was running I
>>>>> investigated it.
>>>>> I think I should note that this benchmark does not reproduce real
>>>>> test cases (I don't know what performance data you get on other DBs).
>>>>>
>>>>> Here is what this benchmark does.
>>>>> Let's suppose that we have to insert 1 000 000 documents (vertices and
>>>>> edges).
>>>>> It creates 500 000 vertices, then takes 2 of them and creates
>>>>> 500 000 edges between them.
>>>>> And everything happens in one transaction.
>>>>>
>>>>> So we have a graph database with 499 998 unconnected vertices and 2
>>>>> vertices which have 500 000 edges, and everything is committed in a
>>>>> single transaction.
>>>>> Did I miss something?
>>>>>
>>>>> I mean, I think you do not expect users to create such a data
>>>>> structure and commit it in a single transaction.
>>>>> Usually data structures are quite different, and changes are committed
>>>>> in the following way: users load data, change it, and commit it.
>>>>>
>>>>> It is my personal opinion, but maybe you would be interested in a
>>>>> performance test which loads real Wikipedia data, loading and
>>>>> committing it in small batches?
>>>>> This test also uses an index, which is very typical for DB usage.
>>>>>
>>>>> We used such a test case, so I can change it and publish it as a Maven
>>>>> project, and because it is TinkerPop-based you can test all the DBs
>>>>> which you are interested in.
>>>>> Our load test does not have properties on vertices, only relations and
>>>>> an index by page key, but it is simple to add additional properties.
>>>>>
>>>>> What do you think?
>>>>>
>>>>>
>>>>>
>>>>> On Sun, Feb 9, 2014 at 5:59 PM, Andrey Yesyev <[email protected]>wrote:
>>>>>
>>>>>> Please post your results!
>>>>>>
>>>>>> Again, any comments regarding source code are very welcome!
>>>>>>
>>>>>>
>>>>>> On Sunday, February 9, 2014 10:50:34 AM UTC-5, Andrey Lomakin wrote:
>>>>>>
>>>>>>> Andrey,
>>>>>>> I do not see any commits in the project:
>>>>>>> https://github.com/ayesyev/graphdb-tests/commits/master
>>>>>>> Did you push them?
>>>>>>>
>>>>>>>
>>>>>>> On Sun, Feb 9, 2014 at 5:47 PM, Andrey Lomakin <[email protected]
>>>>>>> > wrote:
>>>>>>>
>>>>>>>> Got it ! ))
>>>>>>>>
>>>>>>>>
>>>>>>>> On Sun, Feb 9, 2014 at 5:44 PM, Andrey Lomakin <
>>>>>>>> [email protected]> wrote:
>>>>>>>>
>>>>>>>>> Hi Andrey,
>>>>>>>>>
>>>>>>>>> Could you provide instructions on how to run these tests and see
>>>>>>>>> the statistical results?
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Sun, Feb 9, 2014 at 4:59 PM, Andrey Yesyev <[email protected]
>>>>>>>>> > wrote:
>>>>>>>>>
>>>>>>>>>> Ok, here we go!
>>>>>>>>>>
>>>>>>>>>> I added all of Andrey's tips to the project.
>>>>>>>>>>
>>>>>>>>>> storage.diskCache.bufferSize is set to 14336.
>>>>>>>>>>
>>>>>>>>>> All edges have the appropriate number of properties and are added
>>>>>>>>>> this way:
>>>>>>>>>>
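>>>>>>>>>> // Needs java.util.HashMap and java.util.Map, plus the Blueprints
>>>>>>>>>> // Vertex, OrientVertex and OrientEdge classes.
>>>>>>>>>> protected OrientEdge createEdge(Vertex v1, Vertex v2) {
>>>>>>>>>>     // Build the property map attached to every edge.
>>>>>>>>>>     Map<String, String> properties = new HashMap<String, String>();
>>>>>>>>>>     for (int i = 0; i < numberOfProperties; i++) {
>>>>>>>>>>         properties.put("property" + i, "value" + i);
>>>>>>>>>>     }
>>>>>>>>>>     // Create an edge of class "E" from v1 to v2 with those properties.
>>>>>>>>>>     OrientEdge e = ((OrientVertex) v1).addEdge(null, (OrientVertex) v2, "E", null, properties);
>>>>>>>>>>     e.save();
>>>>>>>>>>     return e;
>>>>>>>>>> }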
>>>>>>>>>>
>>>>>>>>>> Results are attached for remote and embedded modes (both using the
>>>>>>>>>> plocal storage type).
>>>>>>>>>> On Monday I'll try to draw my conclusions.
>>>>>>>>>>
>>>>>>>>>> All changes are committed to the GitHub project.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>  --
>>>>>>>>>>
>>>>>>>>>> ---
>>>>>>>>>> You received this message because you are subscribed to the
>>>>>>>>>> Google Groups "OrientDB" group.
>>>>>>>>>> To unsubscribe from this group and stop receiving emails from it,
>>>>>>>>>> send an email to [email protected].
>>>>>>>>>>
>>>>>>>>>> For more options, visit https://groups.google.com/groups/opt_out.
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> --
>>>>>>>>> Best regards,
>>>>>>>>> Andrey Lomakin.
>>>>>>>>>
>>>>>>>>> Orient Technologies
>>>>>>>>> the Company behind OrientDB
>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> --
>>>>>>>> Best regards,
>>>>>>>> Andrey Lomakin.
>>>>>>>>
>>>>>>>> Orient Technologies
>>>>>>>> the Company behind OrientDB
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> --
>>>>>>> Best regards,
>>>>>>> Andrey Lomakin.
>>>>>>>
>>>>>>> Orient Technologies
>>>>>>> the Company behind OrientDB
>>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> Best regards,
>>>>> Andrey Lomakin.
>>>>>
>>>>> Orient Technologies
>>>>> the Company behind OrientDB
>>>>>
>>>>>
>>>>
>>>>
>>>>
>>>> --
>>>> http://about.me/milen
>>>>
>>>>
>>>
>>>
>>>
>>> --
>>> Best regards,
>>> Andrey Lomakin.
>>>
>>> Orient Technologies
>>> the Company behind OrientDB
>>>
>
