INDEXES, INDEXES, INDEXES! Create them on the node properties you match
on BEFORE attempting the import.
120k nodes without indexes: HOURS. 120k nodes with indexes: UNDER A
MINUTE.
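
A minimal sketch of what I mean, assuming the Neo4j 2.x syntax that was
current when this thread was written, and assuming personid is the only
property your MATCH looks up (it is in the query you posted). Without an
index, each CSV row has to scan the Person nodes for both the sender and the
recipient. At 2.5 relationships per second, 360K relationships would take
roughly 144,000 seconds, or about 40 hours. Note this is a schema index on
the label, which as far as I know is separate from the legacy autoindexing
you mention.

```cypher
// Schema index on the property used in both MATCH patterns (Neo4j 2.x syntax).
CREATE INDEX ON :Person(personid);

// Alternative, only if personid is unique per Person (an assumption on my part):
// a uniqueness constraint also creates a backing index.
// CREATE CONSTRAINT ON (p:Person) ASSERT p.personid IS UNIQUE;
```

Run this once, wait until the index is reported as ONLINE, then start your
LOAD CSV again. On newer Neo4j versions the equivalent statement is
CREATE INDEX FOR (p:Person) ON (p.personid).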
On Saturday, October 18, 2014 1:14:21 AM UTC-4, Rodger wrote:
>
> Dear Experts,
>
> I have about 88K Person nodes.
>
> match (x)
> return labels(x), count(*)
>
> +-----------------------+
> | labels(x)  | count(*) |
> +-----------------------+
> | ["Person"] | 87474    |
> +-----------------------+
> 1 row
> 531 ms
>
>
> I'm using LOAD CSV to create about 360K relationships.
>
> USING PERIODIC COMMIT
> LOAD CSV WITH HEADERS FROM "file:///home/notes/enron/mailgraph2.csv" AS
> csvLine
> MATCH ( from:Person { personid: toInt(csvLine.senderid ) }), ( to:Person
> { personid: toInt(csvLine.recipientid )})
> CREATE ( from ) -[ :AGG_EMAILS
> {
> total_to: toInt ( csvLine.total_to )
> , total_cc: toInt ( csvLine.total_cc )
> , total_bcc: toInt ( csvLine.total_bcc )
> , total_emails: toInt ( csvLine.total_emails )
> }
> ]-> ( to )
> ;
>
>
> But when I periodically count from another session, I calculate that only
> about 2.5 relationships are created per second:
>
>
> match (from) -[r]-> (to) return count(*) ;
> +----------+
> | count(*) |
> +----------+
> | 8001     |
> +----------+
> 1 row
> 200 ms
>
>
>
> I tried correcting the autoindexing (see other thread),
> deleting the relationships, and started the load again.
>
> But the load is still running at about the same speed.
>
> Is 2.5 inserts/creations of relationship per second typical?
>
>
> Thanks a lot!
>
>
>
>