[Neo4j] Re: Why is Neo4j slower(totally dead) with many nodes and relationships in lower specification of pc/notebook while MySQL is not?

Lundin Fri, 28 Mar 2014 06:20:19 -0700

ms stands for milliseconds.

What is the corresponding result for an SQL database?
MATCH (n:User)-[:Friend*3]-(FoFoF) return FoFoF;


Albeit a valid search, is it something useful? I would think that finding a 
specific person's FoFoF, with that person as the starting point or end point, 
would be a more realistic scenario. Add an index on User:name, query for a 
User with name: Rio, and try to find his FoFoF.
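
A minimal sketch of that suggestion in Neo4j 2.0 Cypher, assuming User nodes 
carry a name property (the data described in this thread uses username and 
user_id instead, so substitute whichever property is actually indexed):

```cypher
// Create a schema index so the start node is found by lookup
// rather than by scanning every node with the User label.
CREATE INDEX ON :User(name);

// Anchor the traversal at a single user and follow exactly three Friend hops.
// DISTINCT removes users that are reached through more than one path.
MATCH (u:User {name: 'Rio'})-[:Friend*3]-(fofof)
RETURN DISTINCT fofof;
```

The point of the anchor is that the traversal starts from one node instead of 
from every User in the store.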

Yes, Neo4j is kind enough to expose various functions, such as shortestPath, 
in Cypher:
http://docs.neo4j.org/refcard/2.0/
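
As a rough illustration, assuming two users identified by the user_id property 
mentioned in this thread (the ids 1 and 2, numeric ids, and the five-hop cap 
are arbitrary choices for the example), a shortest-path query could look like 
this:

```cypher
// Look up both end points first, then ask for one shortest path
// between them over Friend relationships, at most five hops long.
MATCH (a:User {user_id: 1}), (b:User {user_id: 2}),
      p = shortestPath((a)-[:Friend*..5]-(b))
RETURN p;
```

If no path exists within the hop limit, the query returns no rows.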

Also look at some of the GraphGist examples:
https://github.com/neo4j-contrib/graphgist/wiki

On Friday, 28 March 2014 at 05:00:22 UTC+1, Rio Eduardo wrote:
>
> Thank you so much for the reply, Lundin. I really appreciate it. Yesterday 
> I ran my experiment again, and the result was not what I expected. Before 
> testing *1M* users, I reduced the number of users to *1000* and tested 
> directly against the database (Neo4j Shell) rather than through my social 
> network, to rule out the performance of the PC as the cause. Returning the 
> *1000* users gave *200ms and 1 row*, and returning *friends at depth of 
> two* gave *85000ms and 2500 rows*. Are *200ms* and *85000ms* fast to you? 
> And what does *ms* stand for? Is it *milliseconds* or *microseconds*?
>
> The query I use for returning the *1000* users is
>
> MATCH (U:User) RETURN COUNT(U);
>
> and the query I use for returning *friends at depth of two* is
>
> MATCH (U:User)-[F:Friend]->(FU:User)-[FF:Friend]->(FFU:User)
> WHERE U.user_id=1 AND FFU.user_id<>U.user_id AND NOT (U)-[:Friend]->(FFU)
> RETURN FFU.username
>
> Please note that I tested with the default configuration of Neo4j. I 
> created *1000* random user nodes and *50000* random friend relationships 
> (*1* user has *50* friends). Each relationship has the type Friend and no 
> properties. Each node has the label User and 4 properties: user_id, 
> username, password and profile_picture. Each property value is 1-60 
> characters long: user_id values range from 1 to 1000, all usernames are 10 
> random characters, all passwords are 60 characters because I MD5 them, and 
> profile_picture is 1-60 characters.
>
> Regarding your statement "Otherwise if you really need to present that 
> many "things" just page the result with SKIP, LIMIT. It has never made 
> sense to present 1M of anything at a time to a user.": I already did what 
> you describe, but it is still the same. Neo4j still returns the result 
> more slowly.
>
> I'm also wondering whether Neo4j already implements any graph algorithms 
> (shortest path, Dijkstra, A*, etc.) in its system.
>
> Thank you.
>
>
> On Friday, March 28, 2014 3:43:49 AM UTC+7, Lundin wrote:
>>
>> Rio, any version will do. They can all handle millions of nodes on common 
>> hardware, no magic at all. With hundreds of millions or billions, we 
>> might need to look into the specification in more detail. But in that case, 
>> with that kind of data, there are other bottlenecks for a social network or 
>> any web app that need to be taken care of as well.
>>
>> you said:
>>
>>>  Given any two persons chosen at random, is there a path that connects 
>>> them that is at most five relationships long? For a social network 
>>> containing 1,000,000 people, each with approximately 50 friends, the 
>>> results strongly suggest that graph databases are the best choice for 
>>> connected data. And graph database can still work 150 times faster than 
>>> relational database at third degree and 1000 times faster at fourth degree.
>>>
>>
>> I fail to see how this is connected to your attempt to list 1M users in 
>> one go on the first page. You would want to check whether there is a 
>> relationship between users and return that path. You need two start nodes 
>> and seek a path by traversing the relationships rather than scanning 
>> tables; that would be the comparison.
>> Otherwise, if you really need to present that many "things", just page 
>> the result with SKIP, LIMIT. It has never made sense to present 1M of 
>> anything at a time to a user. Again, that wouldn't really help your 
>> experiment prove anything about graph theory.
>>
>> What is the result of MATCH(U:User) RETURN count(U); ?
>>
>> Also, when you do your test, make sure to account for the warm/cold cache 
>> effect (better/worse performance).
>>
>> On Thursday, 27 March 2014 at 17:57:10 UTC+1, Rio Eduardo wrote:
>>>
>>> I just learned about memory allocation and just read Server Performance 
>>> Tuning of Neo4j. In neo4j.properties:
>>> # Default values for the low-level graph engine
>>>
>>> #neostore.nodestore.db.mapped_memory=25M
>>> #neostore.relationshipstore.db.mapped_memory=50M
>>> #neostore.propertystore.db.mapped_memory=90M
>>> #neostore.propertystore.db.strings.mapped_memory=130M
>>> #neostore.propertystore.db.arrays.mapped_memory=130M
>>>
>>> Should I change these to get high performance? If yes, please suggest values.
>>>
>>> I also just learned about the Neo4j licenses: Community, Personal, 
>>> Startups, Business and Enterprise. All the features are explained on the 
>>> Neo4j website. So which Neo4j should I use for my case, which has millions 
>>> of nodes and relationships?
>>>
>>> Please answer. I need your help so much.
>>>
>>> Thanks.
>>>
>>> On Tuesday, March 25, 2014 12:03:58 AM UTC+7, Rio Eduardo wrote:
>>>>
>>>> I'm testing my thesis, which is about transforming a relational 
>>>> database into a graph database. After the transformation, I will 
>>>> compare their performance in terms of query response time and 
>>>> throughput. For the relational database I use MySQL, and for the graph 
>>>> database I use Neo4j. I will have 3 million more nodes and 6 million 
>>>> more relationships. But when I had added just 60000 nodes, my Neo4j was 
>>>> already dead. When I tried to return all 60000 nodes, it returned 
>>>> unknown. I did the same in MySQL: I added 60000 records, and it could 
>>>> return all 60000 records. It's weird because it contradicts the papers 
>>>> I read, which told me that graph databases are faster than relational 
>>>> databases. So why is Neo4j slower (totally dead) on a lower-specification 
>>>> PC/notebook while MySQL is not? And what PC/notebook specification 
>>>> should I use to get the best performance when testing with millions of 
>>>> nodes and relationships?
>>>>
>>>> Thank you.
>>>>
>>>
