[Neo4j] Re: Why is Neo4j slower(totally dead) with many nodes and relationships in lower specification of pc/notebook while MySQL is not?

Rio Eduardo Fri, 28 Mar 2014 19:33:07 -0700

Thank you for the reply, Lundin.
I already tried MATCH (n:User)-[:Friend*3]-(FoFoF) return FoFoF; and the 
result is still the same, about 85000 ms.
I just realized the mistake was mine. I had assumed that the new Labels 
feature automatically applied an index or constraint, so I never created an 
index or constraint when using Cypher.
After I created the constraint I got the result I expected; see my response 
to Michael's reply below.
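
A minimal sketch of such a setup in Neo4j 2.0 Cypher, assuming the User 
label and the user_id and username properties from the test data described 
further down. The username index is only an illustration, and the exact 
statements that were run are not shown in this message.

```cypher
// Unique constraint on User.user_id. In Neo4j 2.0 a uniqueness constraint
// also creates an index on that property; a label by itself does not.
CREATE CONSTRAINT ON (u:User) ASSERT u.user_id IS UNIQUE;

// Optional plain index on another lookup property (assumed for illustration).
CREATE INDEX ON :User(username);

// Friends at depth two for one user. With the constraint in place, the
// start node can be found through the index instead of scanning every
// :User node and filtering on user_id.
MATCH (u:User)-[:Friend]->(:User)-[:Friend]->(ffu:User)
WHERE u.user_id = 1 AND ffu.user_id <> u.user_id
  AND NOT (u)-[:Friend]->(ffu)
RETURN ffu.username;
```

The query is the same depth-two query quoted below, only reformatted; the 
schema statements are what change how the start node is located.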


And thanks for the references.

On Friday, March 28, 2014 8:08:40 PM UTC+7, Lundin wrote:
>
> ms is milliseconds.
>
> What is the corresponding result for an SQL DB?
> MATCH (n:User)-[:Friend*3]-(FoFoF) return FoFoF;
>
> Although it is a valid search, is it something useful? I would think that 
> finding a specific person's FoFoF, with that person as either the starting 
> point or the end point, would be a much more realistic scenario. Try adding 
> an index on User:name, then query for a User with name:Rio and find his 
> FoFoF.
>
> Yes, Neo4j kindly exposes various functions in Cypher, such as 
> shortestPath:
> http://docs.neo4j.org/refcard/2.0/
>
> Also look at some gist examples
> https://github.com/neo4j-contrib/graphgist/wiki
>
> On Friday, 28 March 2014 at 05:00:22 UTC+1, Rio Eduardo wrote:
>>
>> Thank you so much for the reply, Lundin. I really appreciate it. Yesterday 
>> I ran my experiment again, and the result was not what I had expected. 
>> Before testing *1M* users, I reduced the number of users to *1000* and 
>> ran the test not through my social network but directly against the 
>> database (Neo4j Shell), to check whether the slowness was caused by the 
>> performance of the PC. Counting the *1000* users took *200ms* and 
>> returned *1 row*, while returning *friends at depth of two* took 
>> *85000ms* and returned *2500 rows*. Do *200ms* and *85000ms* seem fast 
>> to you? And what does *ms* stand for: *milliseconds* or 
>> *microseconds*?
>>
>> The query I use for counting the *1000* users is
>>
>> *MATCH (U:User) RETURN COUNT(U);*
>>
>> and the query I use for returning *friends at depth of two* is
>>
>> *MATCH (U:User)-[F:Friend]->(FU:User)-[FF:Friend]->(FFU:User)
>> WHERE U.user_id=1 AND FFU.user_id<>U.user_id AND NOT (U)-[:Friend]->(FFU)
>> RETURN FFU.username*
>>
>> Please note that I tested with the default configuration of Neo4j. I 
>> created *1000* random user nodes and *50000* random friend 
>> relationships (*1* user has *50* friends). Each relationship has the 
>> type Friend and no properties. Each node has the label User and 4 
>> properties: user_id, username, password and profile_picture. Each property 
>> value is 1-60 characters long: user_id takes the values 1-1000, every 
>> username is 10 random characters, every password is 60 characters because 
>> I MD5 it, and profile_picture is 1-60 characters.
>>
>> As for your statement "Otherwise, if you really need to present that many 
>> "things", just page the result with SKIP and LIMIT. It has never made 
>> sense to present 1M of anything at a time to a user.", I already did 
>> that, but it is still the same: Neo4j returns the result more slowly.
>>
>> I'm also wondering whether Neo4j already has graph algorithms (shortest 
>> path, Dijkstra, A*, etc.) built in.
>>
>> Thank you.
>>
>>
>> On Friday, March 28, 2014 3:43:49 AM UTC+7, Lundin wrote:
>>>
>>> Rio, any version will do. They can all handle millions of nodes on common 
>>> hardware, no magic at all. With hundreds of millions or billions, we 
>>> might need to look at the specification in more detail. But with that 
>>> kind of data there are other bottlenecks in a social network, or any web 
>>> app, that need to be taken care of as well.
>>>
>>> you said:
>>>
>>>>  Given any two persons chosen at random, is there a path that connects 
>>>> them that is at most five relationships long? For a social network 
>>>> containing 1,000,000 people, each with approximately 50 friends, the 
>>>> results strongly suggest that graph databases are the best choice for 
>>>> connected data. And graph database can still work 150 times faster than 
>>>> relational database at third degree and 1000 times faster at fourth degree
>>>>
>>>
>>> I fail to see how this is connected to your attempt to list 1M users in 
>>> one go on the first page. You would want to check whether there is a 
>>> relationship between two users and return that path. You need two start 
>>> nodes and to find a path by traversing the relationships rather than 
>>> scanning tables; that would be the comparison.
>>> Otherwise, if you really need to present that many "things", just page 
>>> the result with SKIP and LIMIT. It has never made sense to present 1M of 
>>> anything at a time to a user. Again, that wouldn't really help your 
>>> experiment prove graph theory.
>>>
>>> What is the result of MATCH(U:User) RETURN count(U); ?
>>>
>>> Also, when you run your test, make sure to account for the warm/cold 
>>> cache effect (better/worse performance).
>>>
>>> On Thursday, 27 March 2014 at 17:57:10 UTC+1, Rio Eduardo wrote:
>>>>
>>>> I have just learned about memory allocation and read Server Performance 
>>>> Tuning of Neo4j. In neo4j.properties:
>>>> # Default values for the low-level graph engine
>>>>
>>>> #neostore.nodestore.db.mapped_memory=25M
>>>> #neostore.relationshipstore.db.mapped_memory=50M
>>>> #neostore.propertystore.db.mapped_memory=90M
>>>> #neostore.propertystore.db.strings.mapped_memory=130M
>>>> #neostore.propertystore.db.arrays.mapped_memory=130M
>>>>
>>>> Should I change these to get better performance? If so, please suggest 
>>>> values.
>>>>
>>>> I also just learned about the Neo4j licenses: Community, Personal, 
>>>> Startups, Business and Enterprise. The Neo4j website explains all of 
>>>> their features. Which one should I use for my case, which has millions 
>>>> of nodes and relationships?
>>>>
>>>> Please answer; I really need your help.
>>>>
>>>> Thanks.
>>>>
>>>> On Tuesday, March 25, 2014 12:03:58 AM UTC+7, Rio Eduardo wrote:
>>>>>
>>>>> I'm working on my thesis, which is about transforming a relational 
>>>>> database into a graph database. After the transformation, I will 
>>>>> compare their performance in terms of query response time and 
>>>>> throughput. I use MySQL as the relational database and Neo4j as the 
>>>>> graph database. I will have more than 3 million nodes and more than 
>>>>> 6 million relationships. But when I had added just 60000 nodes, Neo4j 
>>>>> was already dead: when I tried to return all 60000 nodes, it returned 
>>>>> unknown. I did the same in MySQL, adding 60000 records, and it could 
>>>>> return all 60000 records. It's weird because it contradicts the papers 
>>>>> I read, which told me graph databases are faster than relational 
>>>>> databases. So why is Neo4j slower (totally dead) on a lower-specification 
>>>>> PC/notebook while MySQL is not? And what PC/notebook specification 
>>>>> should I use to get the best performance when testing with millions of 
>>>>> nodes and relationships?
>>>>>
>>>>> Thank you.
>>>>>
>>>>

