Re: [Neo4j] Re: Why is Neo4j slower(totally dead) with many nodes and relationships in lower specification of pc/notebook while MySQL is not?

Rio Eduardo Tue, 25 Mar 2014 10:54:12 -0700

OK, I should explain the testing for my thesis in more detail. I use a graph 
database because my use case is a social network. A social network is 
connected data, which maps naturally onto the graph database model. That's 
why I chose a graph database over other NoSQL approaches. In testing, I will 
test both the relational database and the graph database, each both directly 
in the database and through the social network application, which I built 
with PHP (Neo4jPHP). I'm sorry, I just realized that I made a mistake: I 
meant that I wanted to return all nodes that have the label "User" => 
MATCH(U:User) RETURN U. Why do I still want to return all nodes with the 
label "User"? Because the first page of my social network displays all users 
who have already registered. Other activities, such as posting, commenting, 
liking a post, liking a comment, adding a friend and other features, I do 
with graph queries, as you said => "What you want to do are graph queries."
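
For a first page that lists registered users, fetching them one page at a time avoids pulling every User node in a single response. Below is a minimal Python sketch of the request body for the transactional HTTP endpoint discussed in this thread. The helper name build_user_page_request, the page size and the ordering by id(U) are assumptions for illustration, and the {skip}/{limit} parameter syntax is the one used by Cypher in Neo4j 2.x:

```python
import json


def build_user_page_request(page, page_size=50):
    """Build the JSON body for one page of User nodes (hypothetical helper)."""
    if page < 0 or page_size <= 0:
        raise ValueError("page must be >= 0 and page_size must be > 0")
    # ORDER BY keeps pages stable between requests; id(U) is an assumed sort key.
    statement = (
        "MATCH (U:User) RETURN U "
        "ORDER BY id(U) SKIP {skip} LIMIT {limit}"
    )
    body = {
        "statements": [
            {
                "statement": statement,
                "parameters": {"skip": page * page_size, "limit": page_size},
            }
        ]
    }
    return json.dumps(body)


if __name__ == "__main__":
    # The printed JSON can be sent with curl -d to /db/data/transaction/commit.
    print(build_user_page_request(page=2, page_size=50))
```

The body only changes in its parameters from page to page, so the statement text itself stays identical across requests.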


Once again, this is my mistake, or maybe a misunderstanding. I should have 
said that I would have 1M users in my social network, not 1M nodes. Why did 
I choose 1M users? Because of Partner and Vukotic's experiment:

In their book Neo4j in Action, Partner and Vukotic perform an experiment 
using a relational store and Neo4j. The comparison shows that the graph 
database is substantially quicker for connected data than a relational 
store. Partner and Vukotic’s experiment seeks to find friends-of-friends in 
a social network, to a maximum depth of five. Given any two persons chosen 
at random, is there a path that connects them that is at most five 
relationships long? For a social network containing 1,000,000 people, each 
with approximately 50 friends, the results strongly suggest that graph 
databases are the best choice for connected data. The graph database can be 
150 times faster than the relational database at the third degree and 1000 
times faster at the fourth degree. So my hypothesis is this:

My social network containing 1,000,000 people can work up to 150 times 
faster at the third degree, and up to 1000 times faster at the fourth 
degree, when it uses a graph database than when it uses a relational 
database. The graph database can also find friends-of-friends up to the 
fifth degree.
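
The question behind this hypothesis, whether two people are connected by a path of at most five relationships, can be sketched as a depth-limited breadth-first search. This is only an in-memory Python illustration of the query being measured, not how Neo4j or MySQL execute it. The function name connected_within and the adjacency-dict representation of friendships are assumptions for the example:

```python
from collections import deque


def connected_within(graph, start, goal, max_depth=5):
    """Return True if goal is reachable from start in at most max_depth hops.

    graph is an assumed adjacency dict: person -> iterable of friends.
    """
    if start == goal:
        return True
    visited = {start}
    frontier = deque([(start, 0)])
    while frontier:
        person, depth = frontier.popleft()
        if depth == max_depth:
            continue  # do not expand beyond the depth limit
        for friend in graph.get(person, ()):
            if friend == goal:
                return True
            if friend not in visited:
                visited.add(friend)
                frontier.append((friend, depth + 1))
    return False


if __name__ == "__main__":
    # A chain of seven people: a - b - c - d - e - f - g
    names = "abcdefg"
    friends = {n: [] for n in names}
    for left, right in zip(names, names[1:]):
        friends[left].append(right)
        friends[right].append(left)
    print(connected_within(friends, "a", "f"))  # five hops apart
    print(connected_within(friends, "a", "g"))  # six hops apart
```

With roughly 50 friends per person, the number of people reachable grows quickly with each extra degree, which is why the depth limit matters when comparing the two databases.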

That is what I'm actually doing in my thesis right now.

Please help. Please give me suggestions.

Thanks.

On Tuesday, March 25, 2014 10:11:44 PM UTC+7, Michael Hunger wrote:
>
> If you want to test a graph database, it doesn't make sense to return all 
> data. What you want to do are graph queries.
> Otherwise you can just use any KV-store or filesystem if it is just about 
> returning all the data stored in the database.
>
> Curl is a command line tool for executing http requests on unix systems. 
> It also exists for windows.
>
> http://localhost:7474/db/data/transaction/commit is an http API url, not 
> something you point your browser to
>
> Cheers
>
> Michael
>
>
>
> On Tue, Mar 25, 2014 at 4:53 AM, Rio Eduardo <[email protected]> wrote:
>
>> Yes, I want to return all nodes in order to test Neo4j. I just tested it 
>> on a higher-specification PC and it said "Resultset too large (over 1000 
>> rows)". I then tested it again in Neo4j Shell, which reported "68351 rows 
>> and 1580 ms", but when I open 
>> http://localhost:7474/db/data/transaction/commit in the browser, it shows 
>> me a blank page. Please help me run this statement => time 
>> curl -o result.json -d'{"statements":[{"statement":"match (n) return 
>> id(n)"}]}' -H accept:application/json -H content-type:application/json.
>>
>> Thank you.
>>
>>
>> On Tuesday, March 25, 2014 5:27:58 AM UTC+7, Michael Hunger wrote:
>>
>>> Why would you want to return all nodes in the first place?
>>>
>>> If you really want to do that, use the transactional http endpoint and 
>>> curl that streams the response:
>>>
>>> I tested it with a db of 100k nodes, it takes 0.9 seconds to transfer 
>>> them (1.5MB) over the wire
>>>
>>> time curl -o result.json -d'{"statements":[{"statement":"match (n) return id(n)"}]}' -H accept:application/json -H content-type:application/json http://localhost:7474/db/data/transaction/commit
>>>
>>>   % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
>>>                                  Dload  Upload   Total   Spent    Left  Speed
>>> 100 1552k    0 1552k  100    55  1708k     60 --:--:-- --:--:-- --:--:-- 1707k
>>>
>>> real 0m0.915s
>>> user 0m0.153s
>>> sys 0m0.409s
>>>
>>> wuqour:neo4j-enterprise-2.0.1 mh$ ls -lh result.json 
>>>
>>> -rw-r--r--  1 mh  staff   1,5M 24 Mär 23:24 result.json
>>>
>>> If you transfer all their properties by using "return n", it takes 1.4 
>>> seconds and results in 4.1MB transferred.
>>>
>>> If you just want to know how many nodes are in your db, use something 
>>> like this instead:
>>>
>>> match (n) return count(*);
>>> +----------+
>>> | count(*) |
>>> +----------+
>>> | 100052   |
>>> +----------+
>>> 1 row
>>> 186 ms
>>>
>>>
>>>
>>> On Mon, Mar 24, 2014 at 8:46 PM, Javad Karabi <[email protected]> wrote:
>>>
>>>> make sure you are setting up your indexes.
>>>> this was something that i did not do at first, but once i realized how 
>>>> important it was, my queries were incredibly fast.
>>>> also, profile your queries by prepending "profile " to the query, and 
>>>> try to decrease _db_hits.
>>>>
>>>> if you can provide the output of "profile ...", that would be awesome.
>>>>
>>>>
>>>> On Monday, March 24, 2014 12:03:58 PM UTC-5, Rio Eduardo wrote:
>>>>>
>>>>> I'm testing my thesis, which is about transforming a relational 
>>>>> database into a graph database. After the transformation, I will compare 
>>>>> their performance in terms of query response time and throughput. For 
>>>>> the relational database I use MySQL, and for the graph database I use 
>>>>> Neo4j. I will have 3 million or more nodes and 6 million or more 
>>>>> relationships. But when I had added just 60000 nodes, my Neo4j was 
>>>>> already dead: when I tried to return all 60000 nodes, it returned 
>>>>> unknown. I did the same in MySQL, adding 60000 records, and it could 
>>>>> return all 60000 records. This is strange because it contradicts the 
>>>>> papers I read, which said that graph databases are faster than 
>>>>> relational databases. So why is Neo4j slower (totally dead) on a 
>>>>> lower-specification PC/notebook while MySQL is not? And what PC/notebook 
>>>>> specification should I use to get the best performance when testing 
>>>>> with millions of nodes and relationships?
>>>>>
>>>>> Thank you.
>>>>>
>>>>  -- 
>>>> You received this message because you are subscribed to the Google 
>>>> Groups "Neo4j" group.
>>>> To unsubscribe from this group and stop receiving emails from it, send 
>>>> an email to [email protected].
>>>>
>>>> For more options, visit https://groups.google.com/d/optout.
>>>>
>>>
>>
>
>
