Hi Rik! ...in minutes?
I'd like to understand how I could get closer to that result, though I will also try that library. It seems strange to me: whether I use the LOAD CSV functionality from the shell or run a transaction each time, I seem to run into a heap-memory problem. Why should the batch import from the shell be so much slower than the batch-import script?

Also, I see the importer is flexible enough, but my custom file (an adjacency list, to avoid redundancy) is more than 1 GB; if I expand it into a CSV full of redundancy (node-rel-neighbor1, node-rel-neighbor2, ...), it will be much, much bigger, and I am worried about whether it can be handled.

A question: in rel.csv (https://github.com/jexp/batch-import/tree/20) I read that node IDs start from 0. Are they temporary IDs, or are they mandatory? E.g., what if I wanted to upload another subgraph into the same DB with the batch importer (clearly without overriding the existing nodes)?

On Tuesday, August 12, 2014 6:46:00 PM UTC+2, Rik Van Bruggen wrote:
>
> I think you should use the batch importer for this size of a graph. You
> will be done in minutes, not hours.
>
> https://github.com/jexp/batch-import/tree/20
>
> Rik
>
> On Tuesday, August 12, 2014 5:13:39 PM UTC+1, gg4u wrote:
>>
>> Hello,
>>
>> Here I am trying to upload a massive network:
>> 4M nodes, 100M correlations.
>>
>> Having problems with memory and performance, I'd like to know if I am
>> doing it OK:
>>
>> 1. Before loading the correlations, I wanted to load the nodes.
>>
>> 2. Set up neo4j-wrapper and neo4j.properties as written in
>> http://www.neo4j.org/graphgist?d788e117129c3730a042
>> with the JVM heap set at 4096 MB.
>>
>> With this setting, the bulk load of 4M nodes failed.
>>
>> 3. Raised the min-heap and max-heap memory to 6144 MB and
>> ran a test with 100K nodes.
>>
>> I got:
>> Nodes created: 98991
>> Properties set: 197982
>> Labels added: 98991
>> 3438685 ms
>>
>> Almost an hour to upload 100K nodes with two properties?
>> I thought it should be much faster.
>>
>> Am I doing something wrong?
>> This is the importer code I used:
>>
>> CREATE CONSTRAINT ON (n:MYNODES) ASSERT n.id IS UNIQUE;
>> CREATE INDEX ON :MYNODES(name);
>>
>> USING PERIODIC COMMIT 1000
>> LOAD CSV WITH HEADERS FROM 'file:///blablabla.csv' AS line
>> FIELDTERMINATOR '\t'
>> WITH line, toInt(line.topicId) AS id, line.name AS name LIMIT 100000
>> MERGE (n:MYNODES { id: id, name: name });

--
You received this message because you are subscribed to the Google Groups "Neo4j" group.
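A side note on the quoted MERGE (my own guess, not something confirmed in the thread): because it matches on both id and name, it cannot use the unique constraint on :MYNODES(id), so each row may fall back to a label scan, which would explain the hour-long run. The usual pattern is to MERGE on the constrained key only and set the other property on create; a sketch, reusing the file name and column names from the post:

```cypher
// Unique constraint on id, as in the original post.
CREATE CONSTRAINT ON (n:MYNODES) ASSERT n.id IS UNIQUE;

USING PERIODIC COMMIT 1000
LOAD CSV WITH HEADERS FROM 'file:///blablabla.csv' AS line
FIELDTERMINATOR '\t'
// MERGE on the constrained key only, so the lookup can use the index;
// set the remaining property only when the node is first created.
MERGE (n:MYNODES { id: toInt(line.topicId) })
ON CREATE SET n.name = line.name;
```

One caveat: if two rows share an id but carry different names, this keeps the first name seen instead of creating a duplicate node, which may or may not be what you want.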
