Re: [Neo4j] 1 Billion nodes DB: slow performance when connecting

Michael Hunger Tue, 10 Mar 2015 01:23:17 -0700

Hmm,

Yes, it probably pulls all nodes from disk, puts them into the cache, and 
aggregates them :(


Not an operation you want to do with Cypher anyway, or with Neo4j for that matter :)

But it could cheat internally, that's true.
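
As a rough back-of-the-envelope check, the timings quoted below can be turned into throughput figures. This is only a sketch: the numbers are copied from the quoted messages, and the helper name nodes_per_second is made up for illustration.

```python
# Figures quoted further down in this thread.
NODES = 1_000_000_000
COUNT_MS = 430_730                        # match (n) return count(n)
IMPORT_MS = (5 * 60 + 22) * 1000 + 235    # import time: 5m 22s 235ms


def nodes_per_second(nodes: int, elapsed_ms: int) -> float:
    """Average throughput in nodes per second (hypothetical helper)."""
    return nodes / (elapsed_ms / 1000)


count_rate = nodes_per_second(NODES, COUNT_MS)
import_rate = nodes_per_second(NODES, IMPORT_MS)

print(f"count:  {count_rate:,.0f} nodes/s")
print(f"import: {import_rate:,.0f} nodes/s")
```

By these numbers the count runs at roughly 2.3 million nodes per second, which is slower than the import that created the nodes (about 3.1 million per second).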

Michael



> On 10.03.2015 at 03:32, Lorenzo Speranzoni (@inserpio) 
> <[email protected]> wrote:
> 
> I use an SSD on a Debian server, with every configuration setting still at 
> its default value.
> 
> Index creation was fast: 524ms.
> 
> Count is very slow and it doesn't use any index:
> 
> Planner COST
> 
> EagerAggregation
>   |
>   +AllNodesScan
> 
> +------------------+---------------+------------+------------+-------------+-------+
> |         Operator | EstimatedRows |       Rows |     DbHits | Identifiers | Other |
> +------------------+---------------+------------+------------+-------------+-------+
> | EagerAggregation |         31623 |          1 |          0 |    count(n) |       |
> |     AllNodesScan |    1000000000 | 1000000000 | 1000000001 |           n |       |
> +------------------+---------------+------------+------------+-------------+-------+
> 
> Total database accesses: 1000000001
> 
> 
> 
> On Tuesday, 10 March 2015 at 00:55:04 UTC+1, Michael Hunger wrote:
> Can you try to get a thread dump while it is connecting?
> 
> Also, did you run this on an SSD or an HDD? An SSD should be much faster! It 
> took only 40s for me with 12 cores, maxing out an SSD RAID at 1G/s.
> 
> The index will take a while, that's true.
> 
> And the Cypher statement will pull 1bn nodes from disk into memory 
> (so again it is mostly disk speed).
> I'd probably use match (n) return count(*)
> 
> Michael
> 
> 
> On Tuesday, March 10, 2015 at 12:41:12 AM UTC+1, Lorenzo Speranzoni 
> (@inserpio) wrote:
> Hi All,
> 
> I'm testing performance on a 1-billion-node database.
> 
> To create nodes I used Michael Hunger's code here: 
> https://gist.github.com/jexp/0ff850ab2ce41c9ca5e6 
> 
> The import went really well: it took 5m 22s 235ms.
> 
> But when I try to connect to the newly created database via the shell, it 
> takes a really long time.
> This is the command I run:
> 
> JAVA_OPTS="-Xmx16G -Xms16G -server -d64" ./neo4j-shell -path /path/to/db/
> 
> It also takes a long time to count all the nodes, even with an index:
> 
> neo4j-sh (?)$ create index on :Person(id);
> +-------------------+
> | No data returned. |
> +-------------------+
> Indexes added: 1
> 524 ms
> 
> neo4j-sh (?)$ match (n) return count(n);
> +------------+
> | count(n)   |
> +------------+
> | 1000000000 |
> +------------+
> 1 row
> 430730 ms
> 
> 
> 
> Thanks in advance for the help!
> 
> Lorenzo
> 
> 
> -- 
> You received this message because you are subscribed to the Google Groups 
> "Neo4j" group.
> To unsubscribe from this group and stop receiving emails from it, send an 
> email to [email protected].
> For more options, visit https://groups.google.com/d/optout.

