Please try to format your code so that it is easier to read:

MERGE (first:{TYPE1} {{id:'{val1}'}})
MERGE (second:{TYPE2} {{id:'{val2}'}})
MERGE (first)-[r:{RTYPE}]->(second)
  ON CREATE SET r.weight={weight_set}
  ON MATCH SET {weight_compute}
WITH r
SET r.half_life={half_life},
    r.update_time=TIMESTAMP(),
    r.threshold={threshold}
WITH r
WHERE r.weight<r.threshold
DELETE r

self.query = neo.CypherQuery(
    self.graph_db,
    self.create_rels.format(
        TYPE1=entity1[0], val1=entity1[1],
        TYPE2=entity2[0], val2=entity2[1],
        RTYPE=rel_type, weight_set=weight_set,
        weight_compute=CYPHER_WEIGHT_COMPUTE,
        half_life=half_life, threshold=threshold))

You shouldn't use string.format; use Cypher parameters instead.
Do you have constraints on all the label + id combinations you merge on?
How many relationships do you have between the merged entities?
How long does a single statement take?
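
To illustrate the point about Cypher parameters, here is a minimal sketch of how the statement from the question could be split into a fixed text plus a parameter map. It assumes the Neo4j 2.x `{name}` parameter syntax; the helper names (`build_statement`, `build_params`) and the relationship type `KNOWS` are made up for the example. Labels and relationship types cannot be passed as parameters, so they are still interpolated, but only after a strict check.

```python
import re

# Labels and relationship types cannot be Cypher parameters, so they are
# still formatted into the text, but only if they look like plain identifiers.
_IDENTIFIER = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")

# Weight update expression as given in the original question.
CYPHER_WEIGHT_COMPUTE = (
    "r.weight=r.weight+r.weight*"
    "EXP((TIMESTAMP()-r.update_time)/(r.half_life*1.0))"
)

# Doubled braces survive str.format, so {{id1}} becomes the Cypher
# parameter {id1} (Neo4j 2.x syntax) in the final statement text.
MERGE_TEMPLATE = (
    "MERGE (first:{label1} {{id: {{id1}}}}) "
    "MERGE (second:{label2} {{id: {{id2}}}}) "
    "MERGE (first)-[r:{rel_type}]->(second) "
    "ON CREATE SET r.weight = {{weight_set}} "
    "ON MATCH SET {weight_compute} "
    "WITH r "
    "SET r.half_life = {{half_life}}, "
    "r.update_time = TIMESTAMP(), "
    "r.threshold = {{threshold}} "
    "WITH r WHERE r.weight < r.threshold "
    "DELETE r"
)


def build_statement(label1, label2, rel_type):
    """Return the statement text; it is identical for every call that
    uses the same labels and relationship type."""
    for name in (label1, label2, rel_type):
        if not _IDENTIFIER.match(name):
            raise ValueError("unsafe identifier: %r" % (name,))
    return MERGE_TEMPLATE.format(
        label1=label1, label2=label2, rel_type=rel_type,
        weight_compute=CYPHER_WEIGHT_COMPUTE)


def build_params(id1, id2, weight_set, half_life, threshold):
    """Return the values that change from call to call."""
    return {
        "id1": id1,
        "id2": id2,
        "weight_set": weight_set,
        "half_life": half_life,
        "threshold": threshold,
    }


# Hypothetical example values; "UID" is the label from the question.
text = build_statement("UID", "UID", "KNOWS")
params = build_params("42", "43", 1.0, 86400000, 0.1)
print(text)
print(params)
```

Because the text no longer changes with every id, the server can reuse the compiled plan instead of parsing a new statement each time. Assuming your py2neo release accepts parameters as keyword arguments to execute(), the call would look roughly like neo.CypherQuery(self.graph_db, text).execute(**params).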
Your config/machine is probably a bit small for a multi-billion size active
graph.
You should definitely increase the mmio settings to:
neostore.nodestore.db.mapped_memory=512M
neostore.relationshipstore.db.mapped_memory=2G
neostore.propertystore.db.mapped_memory=1G
neostore.propertystore.db.strings.mapped_memory=128M
neostore.propertystore.db.arrays.mapped_memory=128M
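
As a quick sanity check, the sketch below adds up the mapped-memory values listed above and compares them with the 4096 MB heap and the 8G of RAM mentioned in the question. It is only arithmetic; the helper name `to_mb` is made up for the example, and 8G is taken as 8192 MB.

```python
# Suggested memory-mapped sizes from this message.
MAPPED = {
    "neostore.nodestore.db.mapped_memory": "512M",
    "neostore.relationshipstore.db.mapped_memory": "2G",
    "neostore.propertystore.db.mapped_memory": "1G",
    "neostore.propertystore.db.strings.mapped_memory": "128M",
    "neostore.propertystore.db.arrays.mapped_memory": "128M",
}

HEAP_MB = 4096      # wrapper.java.maxmemory from the question
RAM_MB = 8 * 1024   # "RAM of the server is 8G", assumed to mean 8192 MB


def to_mb(size):
    """Convert a size such as '512M' or '2G' to megabytes."""
    number, unit = int(size[:-1]), size[-1].upper()
    return number * {"M": 1, "G": 1024}[unit]


mapped_mb = sum(to_mb(value) for value in MAPPED.values())
remaining_mb = RAM_MB - HEAP_MB - mapped_mb

print("mapped stores: %d MB" % mapped_mb)
print("heap:          %d MB" % HEAP_MB)
print("left for OS:   %d MB" % remaining_mb)
```

With these numbers the mapped stores add up to 3840 MB, which together with the heap leaves about 256 MB of the 8G for everything else, so the heap size may need to be revisited alongside these settings.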
Disable node and relationship auto-indexing (node_auto_indexing and
relationship_auto_indexing in neo4j.properties); you don't need it.
I think what you see could be:
- full scans due to missing indexes
- garbage collection
You should profile your query in the neo4j-shell (fill in the real values
and prefix the query with "profile") and share the results.
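
For the missing-index and profiling points, here is a small sketch that prints the statements one could paste into the neo4j-shell. It assumes every merged label uses `id` as its key, as in the question; the helper names and the label `Person` are made up for the example.

```python
def constraint_statements(labels):
    """One uniqueness constraint per distinct label that is merged on.
    The constraint also provides the index that MERGE needs for lookups."""
    return [
        "CREATE CONSTRAINT ON (n:%s) ASSERT n.id IS UNIQUE;" % label
        for label in sorted(set(labels))
    ]


def profiled(statement):
    """Prefix a statement (with real values filled in) with PROFILE and
    terminate it with the semicolon the shell expects."""
    return "PROFILE " + statement.strip().rstrip(";") + ";"


# "UID" comes from the question; "Person" is a hypothetical second label.
for line in constraint_statements(["UID", "Person", "UID"]):
    print(line)

# Hypothetical statement with literal values, ready for the shell.
print(profiled("MERGE (first:UID {id:'42'}) RETURN first"))
```

If the profile output shows a label scan rather than an index lookup for the MERGE, a constraint is probably missing for that label.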
Michael
On Mon, Oct 27, 2014 at 4:13 PM, Aileen Agricola <
[email protected]> wrote:
> Hi Liu,
>
> I'm forwarding your question to our google group [email protected]
> Please provide any additional information there.
>
> best,
>
> Aileen Agricola
> Web Program Manager | Neo Technology
> [email protected] | 206.437.2524
>
>
> On Mon, Oct 27, 2014 at 8:08 AM, LIU Xiaobing <[email protected]> wrote:
>
>> Hi experts,
>> I am now encountering a performance problem with neo4j. I am trying to
>> write some data to neo4j; the data scale is in the billions and the
>> relationships between the data are just people-people. When I use py2neo to
>> query and write data to neo4j, I find that it is very slow.
>> The query clause I use:
>> create_rels = 'MERGE(first:{TYPE1} {{id:'{val1}'}}) MERGE
>> (second:{TYPE2} {{id:'{val2}'}}) MERGE (first)-[r:{RTYPE}]->(second) ON
>> CREATE SET r.weight={weight_set} ON MATCH SET {weight_compute} WITH r SET
>> r.half_life={half_life},r.update_time=TIMESTAMP(),r.threshold={threshold}
>> WITH r WHERE r.weight<r.threshold DELETE r'
>>
>> self.query=neo.CypherQuery(self.graph_db,self.create_rels.format(TYPE1=entity1[0],val1=entity1[1],TYPE2=entity2[0],val2=entity2[1],RTYPE=rel_type,weight_set=weight_set,weight_compute=CYPHER_WEIGHT_COMPUTE,half_life=half_life,threshold=threshold))
>> self.query.execute()
>>
>> the CYPHER_WEIGHT_COMPUTE definition is
>> "r.weight=r.weight+r.weight*EXP((TIMESTAMP()-r.update_time)/(r.half_life*1.0))"
>>
>> The purpose of the clause is that the nodes and relationships will be
>> created when the nodes are not in the graph db, and the properties of the
>> relationships will be updated if they are.
>> I have tried the following ways to improve the performance, but they
>> didn't work well.
>> 1) Edit the configuration files of the server
>> neo4j-wrapper.conf:
>> wrapper.java.initmemory=4096
>> wrapper.java.maxmemory=4096
>> wrapper.java.minmemory=4096
>>
>> neo4j.properties
>> neostore.nodestore.db.mapped_memory=256M
>> neostore.relationshipstore.db.mapped_memory=256M
>> neostore.propertystore.db.mapped_memory=256M
>> neostore.propertystore.db.strings.mapped_memory=128M
>> neostore.propertystore.db.arrays.mapped_memory=128M
>>
>> node_auto_indexing=true
>> relationship_auto_indexing=true
>>
>> 2) Create constraints of the properties of nodes in order to create
>> indexes
>> Cypher clause: create constraint on (n:UID) assert n.id IS UNIQUE
>>
>> When I check the load of the server, which is equipped with 16 4-core
>> processors, I find that the CPU load is very high while the network and
>> IO load is not. Is the Cypher clause CPU-greedy? How else can I improve
>> the performance? Thanks very much.
>>
>> By the way, the version of neo4j is the 2.1.5 stable version, the version
>> of the client py2neo is 1.1.6, and the RAM of the server is 8G.
>>
>>
>> --
>> Best Regards
>> LIU Xiaobing 刘小兵
>>
>>
> --
> You received this message because you are subscribed to the Google Groups
> "Neo4j" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> For more options, visit https://groups.google.com/d/optout.
>