My revised script:

USING PERIODIC COMMIT 10000
LOAD CSV WITH HEADERS FROM "file://localhost/home/deployer/tblMfr.csv" AS
csvLine
    FIELDTERMINATOR '\t'
CREATE (vendor:GraphVendor { vendor_code_id: toInt(csvLine.Mfr_Code_ID),
vendor_id: toInt(csvLine.Mfr_ID), vendor_name: csvLine.Mfr_Name,
vendor_abbreviation: csvLine.Mfr_Abbr, vendor_status: csvLine.Mfr_Status })
WITH toInt(csvLine.Mfr_ID) as vendor_id
MATCH (vendor:GraphVendor { vendor_id: vendor_id})
MATCH (part:GraphPart {mfr_id: vendor_id})
MERGE (part)-[:MANUFACTURED_BY]->(vendor);

:schema

Indexes
  ON :GraphPart(mfr_id)      ONLINE
  ON :GraphPart(part_id)     ONLINE (for uniqueness constraint)
  ON :GraphVendor(vendor_id) ONLINE (for uniqueness constraint)

Constraints
  ON (graphpart:GraphPart) ASSERT graphpart.part_id IS UNIQUE
  ON (graphvendor:GraphVendor) ASSERT graphvendor.vendor_id IS UNIQUE


The import still spends significant time in GC:
2014-08-28 14:18:30.559+0000 WARN  [o.n.k.EmbeddedGraphDatabase]: GC
Monitor: Application threads blocked for an additional 27270ms [total
block time: 111.254s]
2014-08-28 14:18:58.105+0000 WARN  [o.n.k.EmbeddedGraphDatabase]: GC
Monitor: Application threads blocked for an additional 25724ms [total
block time: 136.978s]
2014-08-28 14:19:19.571+0000 WARN  [o.n.k.EmbeddedGraphDatabase]: GC
Monitor: Application threads blocked for an additional 19982ms [total
block time: 156.96s]
2014-08-28 14:19:47.826+0000 WARN  [o.n.k.EmbeddedGraphDatabase]: GC
Monitor: Application threads blocked for an additional 26533ms [total
block time: 183.493s]
2014-08-28 14:19:48.088+0000 WARN  [o.n.k.EmbeddedGraphDatabase]: GC
Monitor: Application threads blocked for an additional 162ms [total
block time: 183.655s]
2014-08-28 14:20:16.149+0000 WARN  [o.n.k.EmbeddedGraphDatabase]: GC
Monitor: Application threads blocked for an additional 26880ms [total
block time: 210.535s]
2014-08-28 14:20:37.432+0000 WARN  [o.n.k.EmbeddedGraphDatabase]: GC
Monitor: Application threads blocked for an additional 20102ms [total
block time: 230.637s]
2014-08-28 14:21:06.477+0000 WARN  [o.n.k.EmbeddedGraphDatabase]: GC
Monitor: Application threads blocked for an additional 27755ms [total
block time: 258.392s]
2014-08-28 14:21:06.907+0000 WARN  [o.n.k.EmbeddedGraphDatabase]: GC
Monitor: Application threads blocked for an additional 330ms [total
block time: 258.722s]
2014-08-28 14:21:35.483+0000 WARN  [o.n.k.EmbeddedGraphDatabase]: GC
Monitor: Application threads blocked for an additional 27721ms [total
block time: 286.443s]
2014-08-28 14:21:57.764+0000 WARN  [o.n.k.EmbeddedGraphDatabase]: GC
Monitor: Application threads blocked for an additional 21430ms [total
block time: 307.873s]
2014-08-28 14:22:27.172+0000 WARN  [o.n.k.EmbeddedGraphDatabase]: GC
Monitor: Application threads blocked for an additional 28122ms [total
block time: 335.995s]
2014-08-28 14:22:27.613+0000 WARN  [o.n.k.EmbeddedGraphDatabase]: GC
Monitor: Application threads blocked for an additional 340ms [total
block time: 336.335s]
2014-08-28 14:22:56.549+0000 WARN  [o.n.k.EmbeddedGraphDatabase]: GC
Monitor: Application threads blocked for an additional 28191ms [total
block time: 364.526s]
2014-08-28 14:23:18.865+0000 WARN  [o.n.k.EmbeddedGraphDatabase]: GC
Monitor: Application threads blocked for an additional 21591ms [total
block time: 386.117s]
2014-08-28 14:23:44.941+0000 WARN  [o.n.k.EmbeddedGraphDatabase]: GC
Monitor: Application threads blocked for an additional 25330ms [total
block time: 411.447s]
2014-08-28 14:23:45.415+0000 WARN  [o.n.k.EmbeddedGraphDatabase]: GC
Monitor: Application threads blocked for an additional 374ms [total
block time: 411.821s]
2014-08-28 14:24:13.505+0000 WARN  [o.n.k.EmbeddedGraphDatabase]: GC
Monitor: Application threads blocked for an additional 27449ms [total
block time: 439.27s]
2014-08-28 14:24:33.630+0000 WARN  [o.n.k.EmbeddedGraphDatabase]: GC
Monitor: Application threads blocked for an additional 19595ms [total
block time: 458.865s]
2014-08-28 14:25:00.748+0000 WARN  [o.n.k.EmbeddedGraphDatabase]: GC
Monitor: Application threads blocked for an additional 26693ms [total
block time: 485.558s]
2014-08-28 14:25:01.247+0000 WARN  [o.n.k.EmbeddedGraphDatabase]: GC
Monitor: Application threads blocked for an additional 398ms [total
block time: 485.956s]
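To put a number on that, here is a rough Python sketch that estimates how much of the wall-clock window was spent blocked, using only the first and last GC Monitor lines above. It assumes each log line is written right after its pause ends and that the reported block times roughly reflect real stop-the-world time; the helper names (parse, blocked_fraction) are made up for illustration, and the wrapped email lines are joined back into single lines by hand.

```python
import re
from datetime import datetime

# Matches one unwrapped GC Monitor line from messages.log.
GC_LINE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3})\+0000 WARN .*?"
    r"blocked for an additional (?P<pause>\d+)ms "
    r"\[total block time: (?P<total>[\d.]+)s\]"
)

def parse(line):
    """Return (timestamp, pause in seconds, cumulative block time in seconds)."""
    m = GC_LINE.search(line)
    ts = datetime.strptime(m.group("ts"), "%Y-%m-%d %H:%M:%S.%f")
    return ts, int(m.group("pause")) / 1000.0, float(m.group("total"))

def blocked_fraction(first_line, last_line):
    t0, pause0, total0 = parse(first_line)
    t1, _, total1 = parse(last_line)
    # Assumption: the first pause ended at t0, so the window starts
    # pause0 seconds before the first timestamp.
    window = (t1 - t0).total_seconds() + pause0
    # Block time accumulated inside the window, including the first pause.
    blocked = total1 - (total0 - pause0)
    return blocked, window, blocked / window

first = ("2014-08-28 14:18:30.559+0000 WARN  [o.n.k.EmbeddedGraphDatabase]: GC "
         "Monitor: Application threads blocked for an additional 27270ms [total "
         "block time: 111.254s]")
last = ("2014-08-28 14:25:01.247+0000 WARN  [o.n.k.EmbeddedGraphDatabase]: GC "
        "Monitor: Application threads blocked for an additional 398ms [total "
        "block time: 485.956s]")

blocked, window, fraction = blocked_fraction(first, last)
print(f"{blocked:.1f}s blocked in a {window:.1f}s window ({fraction:.0%})")
```

On those two lines this reports roughly 402 s blocked out of a window of about 418 s, i.e. around 96% of the time in that stretch.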


I've got 8GB of memory set for the JVM. Should I increase this to 12GB? Also,
would it help if I turned on GC logging and posted those logs?
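
For reference, a minimal sketch of what turning on GC logging could look like in neo4j-wrapper.conf. It assumes a HotSpot JVM of this era (Java 7) and that extra JVM flags are passed via wrapper.java.additional, as with the other entries in that file; the log file path is an assumption and can be any writable location.

```
# neo4j-wrapper.conf (sketch; the log path is an assumed location)
wrapper.java.additional=-Xloggc:data/log/neo4j-gc.log
wrapper.java.additional=-XX:+PrintGCDetails
wrapper.java.additional=-XX:+PrintGCDateStamps
wrapper.java.additional=-XX:+PrintGCApplicationStoppedTime
```

The server would need a restart for these JVM flags to take effect.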


On Wed, Aug 27, 2014 at 5:14 PM, Michael Hunger <
[email protected]> wrote:

> Chris,
>
> your cypher query seems to be wrong:
>
> 1. split it up into node creation and relationship creation
> 2. use bigger transaction sizes
> 3. you forgot a colon before :GraphPart so it doesn't use an index for
> that one
> 4. you don't have to use the path and foreach; a simple match is good enough
>
> USING PERIODIC COMMIT 10000
>
> LOAD CSV WITH HEADERS FROM "file://localhost/home/deployer/tblMfr.csv" AS
> csvLine
>     FIELDTERMINATOR '\t'
> CREATE (vendor:GraphVendor { vendor_code_id: toInt(csvLine.Mfr_Code_ID),
> vendor_id: toInt(csvLine.Mfr_ID), vendor_name: csvLine.Mfr_Name,
> vendor_abbreviation: csvLine.Mfr_Abbr, vendor_status: csvLine.Mfr_Status });
>
>
> create index on :GraphVendor(vendor_id);
>
> USING PERIODIC COMMIT 10000
>
> LOAD CSV WITH HEADERS FROM "file://localhost/home/deployer/tblMfr.csv" AS
> csvLine
>     FIELDTERMINATOR '\t'
>
> WITH toInt(csvLine.Mfr_ID) as vendor_id
>
> MATCH (vendor:GraphVendor { vendor_id: vendor_id})
> MATCH (part:GraphPart {mfr_id: vendor_id})
> MERGE (part)-[:MANUFACTURED_BY]->(vendor);
>
>
>
> On 26.08.2014 at 23:33, Chris G <[email protected]> wrote:
>
> Group, I'm trying to wrap my head around the memory configuration for
> Neo4j.
>
> I've got ~4 million parts that I have loaded and indexed via cypher and
> have these indexes:
>
> Indexes
>   ON :GraphPart(mfr_id)  ONLINE
>   ON :GraphPart(part_id) ONLINE (for uniqueness constraint)
>
> Constraints
>   ON (graphpart:GraphPart) ASSERT graphpart.part_id IS UNIQUE
>
>
>
> Now I want to import my vendors via this cypher:
>
> USING PERIODIC COMMIT 1
> LOAD CSV WITH HEADERS FROM "file://localhost/home/deployer/tblMfr.csv" AS
> csvLine
>     FIELDTERMINATOR '\t'
> CREATE (vendor:GraphVendor { vendor_code_id: toInt(csvLine.Mfr_Code_ID),
> vendor_id: toInt(csvLine.Mfr_ID), vendor_name: csvLine.Mfr_Name,
> vendor_abbreviation: csvLine.Mfr_Abbr, vendor_status: csvLine.Mfr_Status })
> WITH vendor
> MATCH p = (GraphPart {mfr_id: vendor.vendor_id})
> FOREACH (n IN nodes(p) | MERGE (n)-[r:MANUFACTURED_BY]->(vendor))
>
>
> I have configured the conf files:
>
> neo4j.properties:
> neostore.nodestore.db.mapped_memory=50M
> neostore.relationshipstore.db.mapped_memory=500M
> neostore.propertystore.db.mapped_memory=100M
> neostore.propertystore.db.strings.mapped_memory=130M
> neostore.propertystore.db.arrays.mapped_memory=0M
>
> neo4j-wrapper.conf:
>
> wrapper.java.initmemory=4096
> wrapper.java.maxmemory=12288
>
>
> Even with a 12G heap and PERIODIC COMMIT *1*, messages.log looks like this:
> 2014-08-26 21:14:08.936+0000 WARN  [o.n.k.EmbeddedGraphDatabase]: GC
> Monitor: Application threads blocked for an additional 719ms [total block
> time: 16.227s]
> 2014-08-26 21:14:10.874+0000 WARN  [o.n.k.EmbeddedGraphDatabase]: GC
> Monitor: Application threads blocked for an additional 1630ms [total block
> time: 17.857s]
> 2014-08-26 21:14:12.377+0000 WARN  [o.n.k.EmbeddedGraphDatabase]: GC
> Monitor: Application threads blocked for an additional 673ms [total block
> time: 18.53s]
> 2014-08-26 21:14:13.715+0000 WARN  [o.n.k.EmbeddedGraphDatabase]: GC
> Monitor: Application threads blocked for an additional 719ms [total block
> time: 19.249s]
> 2014-08-26 21:14:15.424+0000 WARN  [o.n.k.EmbeddedGraphDatabase]: GC
> Monitor: Application threads blocked for an additional 1400ms [total block
> time: 20.649s]
> 2014-08-26 21:14:16.924+0000 WARN  [o.n.k.EmbeddedGraphDatabase]: GC
> Monitor: Application threads blocked for an additional 754ms [total block
> time: 21.403s]
> 2014-08-26 21:14:18.146+0000 WARN  [o.n.k.EmbeddedGraphDatabase]: GC
> Monitor: Application threads blocked for an additional 908ms [total block
> time: 22.311s]
> 2014-08-26 21:14:19.881+0000 WARN  [o.n.k.EmbeddedGraphDatabase]: GC
> Monitor: Application threads blocked for an additional 1207ms [total block
> time: 23.518s]
> 2014-08-26 21:14:21.551+0000 WARN  [o.n.k.EmbeddedGraphDatabase]: GC
> Monitor: Application threads blocked for an additional 1033ms [total block
> time: 24.551s]
> 2014-08-26 21:14:22.801+0000 WARN  [o.n.k.EmbeddedGraphDatabase]: GC
> Monitor: Application threads blocked for an additional 827ms [total block
> time: 25.378s]
> 2014-08-26 21:14:49.154+0000 WARN  [o.n.k.EmbeddedGraphDatabase]: GC
> Monitor: Application threads blocked for an additional 26040ms [total block
> time: 51.418s]
> 2014-08-26 21:14:49.524+0000 WARN  [o.n.k.EmbeddedGraphDatabase]: GC
> Monitor: Application threads blocked for an additional 270ms [total block
> time: 51.688s]
> 2014-08-26 21:15:24.662+0000 WARN  [o.n.k.EmbeddedGraphDatabase]: GC
> Monitor: Application threads blocked for an additional 32772ms [total block
> time: 84.46s]
> 2014-08-26 21:15:51.122+0000 WARN  [o.n.k.EmbeddedGraphDatabase]: GC
> Monitor: Application threads blocked for an additional 26039ms [total block
> time: 110.499s]
> 2014-08-26 21:16:24.233+0000 WARN  [o.n.k.EmbeddedGraphDatabase]: GC
> Monitor: Application threads blocked for an additional 32902ms [total block
> time: 143.401s]
> 2014-08-26 21:16:50.232+0000 WARN  [o.n.k.EmbeddedGraphDatabase]: GC
> Monitor: Application threads blocked for an additional 25898ms [total block
> time: 169.299s]
> 2014-08-26 21:17:20.085+0000 WARN  [o.n.k.EmbeddedGraphDatabase]: GC
> Monitor: Application threads blocked for an additional 29753ms [total block
> time: 199.052s]
> 2014-08-26 21:17:46.225+0000 WARN  [o.n.k.EmbeddedGraphDatabase]: GC
> Monitor: Application threads blocked for an additional 26040ms [total block
> time: 225.092s]
> 2014-08-26 21:21:04.960+0000 WARN  [o.n.k.EmbeddedGraphDatabase]: GC
> Monitor: Application threads blocked for an additional 29433ms [total block
> time: 254.525s]
>
>
> Could anyone suggest what I can try next, or some alternative memory
> settings?
>
> I'm trying to get proof of concept up and running so I can present this to
> my bosses.
>
> I hope I am missing something simple. If not, I think it's time for Neo4j
> to invest in some canonical documentation on how to configure Neo4j memory
> usage. There are sparse mentions in the user guide, but most of what I find
> related to performance comes from blog posts, Stack Overflow questions, and
> mailing list posts (most of which Michael Hunger is answering). I also hope
> that once I get past these initial memory settings, the rest of Neo4j will
> just work.
>
> Thanks for reading,
>
> Chris
>
>
>
>
>
>
> --
> You received this message because you are subscribed to the Google Groups
> "Neo4j" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
>
> For more options, visit https://groups.google.com/d/optout.
>



-- 
CR
