Re: [Neo4j] Re: The performance about neo4j

Michael Hunger Wed, 05 Nov 2014 09:16:23 -0800

Please try to format your code so that it is easier to read:

MERGE (first:{TYPE1} {{id:'{val1}'}})
MERGE (second:{TYPE2} {{id:'{val2}'}})
MERGE (first)-[r:{RTYPE}]->(second)
  ON CREATE SET r.weight={weight_set}
  ON MATCH SET {weight_compute}
WITH r
SET r.half_life={half_life},
    r.update_time=TIMESTAMP(),
    r.threshold={threshold}
WITH r
WHERE r.weight<r.threshold
DELETE r


self.query = neo.CypherQuery(
    self.graph_db,
    self.create_rels.format(
        TYPE1=entity1[0], val1=entity1[1],
        TYPE2=entity2[0], val2=entity2[1],
        RTYPE=rel_type, weight_set=weight_set,
        weight_compute=CYPHER_WEIGHT_COMPUTE,
        half_life=half_life, threshold=threshold))

You shouldn't build the statement with string.format; use Cypher parameters
for the values instead, so the query plan can be cached and reused.
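
A minimal sketch of what a parameterized version could look like. Labels and relationship types cannot be passed as Cypher parameters, so they are still interpolated here after an identifier check; everything else goes into a parameter dict. The helper name `build_merge_statement` and the validation regex are assumptions for illustration, not part of py2neo. Values are passed with their own types instead of being wrapped in quotes as in the original statement.

```python
import re

# Labels and relationship types cannot be Cypher parameters, so they are
# interpolated, but only after a strict identifier check (an assumption
# about which names are acceptable).
_IDENTIFIER = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")

# Weight update expression from the original post, unchanged.
CYPHER_WEIGHT_COMPUTE = (
    "r.weight=r.weight+r.weight*"
    "EXP((TIMESTAMP()-r.update_time)/(r.half_life*1.0))"
)


def build_merge_statement(entity1, entity2, rel_type,
                          weight_set, half_life, threshold):
    """Return (query_text, params). Hypothetical helper, not a py2neo API."""
    type1, val1 = entity1
    type2, val2 = entity2
    for name in (type1, type2, rel_type):
        if not _IDENTIFIER.match(name):
            raise ValueError("unsafe label or relationship type: %r" % (name,))
    # {name} is the Neo4j 2.x parameter placeholder.
    lines = [
        "MERGE (first:%s {id:{val1}})" % type1,
        "MERGE (second:%s {id:{val2}})" % type2,
        "MERGE (first)-[r:%s]->(second)" % rel_type,
        "  ON CREATE SET r.weight={weight_set}",
        "  ON MATCH SET " + CYPHER_WEIGHT_COMPUTE,
        "WITH r",
        "SET r.half_life={half_life},",
        "    r.update_time=TIMESTAMP(),",
        "    r.threshold={threshold}",
        "WITH r",
        "WHERE r.weight<r.threshold",
        "DELETE r",
    ]
    params = {
        "val1": val1,
        "val2": val2,
        "weight_set": weight_set,
        "half_life": half_life,
        "threshold": threshold,
    }
    return "\n".join(lines), params


query_text, params = build_merge_statement(
    ("UID", "alice"), ("UID", "bob"), "KNOWS", 1.0, 3600, 0.1)
print(query_text)
print(params)
```

The query text is now identical for every pair of entities with the same labels and relationship type; only the parameter dict changes. How the dict is handed to the server depends on the py2neo version in use, so check its documentation for the exact call.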

Do you have constraints on all the label + id combinations you merge on?
How many relationships do you have between the merged entities?
How long does a single statement take?
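
For the first question, a small sketch that generates one uniqueness constraint per label, using the same Neo4j 2.x syntax as the constraint in the original post. The helper name and the `Person` label are made up for illustration; only `UID` appears in the thread.

```python
def constraint_statements(labels, prop="id"):
    """Build one uniqueness constraint statement per distinct label."""
    return [
        "CREATE CONSTRAINT ON (n:%s) ASSERT n.%s IS UNIQUE" % (label, prop)
        for label in sorted(set(labels))
    ]


# "Person" is a hypothetical label; substitute every label used in a MERGE.
for statement in constraint_statements(["UID", "Person", "UID"]):
    print(statement)
```

Each statement can then be run once, for example in the neo4j-shell, before the bulk load starts.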

Your config/machine is probably a bit small for a multi-billion size active
graph.

You should definitely increase the mmio (memory-mapped I/O) settings to:

    neostore.nodestore.db.mapped_memory=512M
    neostore.relationshipstore.db.mapped_memory=2G
    neostore.propertystore.db.mapped_memory=1G
    neostore.propertystore.db.strings.mapped_memory=128M
    neostore.propertystore.db.arrays.mapped_memory=128M
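
As a rough check of how these values fit the machine from the original post (8G RAM, 4096 MB JVM heap), the arithmetic can be sketched as below. It assumes the mapped memory is allocated outside the JVM heap, which is the usual case on Linux; the helper `to_mb` is made up for this example.

```python
UNITS_MB = {"M": 1, "G": 1024}


def to_mb(value):
    """Convert a size such as '512M' or '2G' to megabytes."""
    return int(value[:-1]) * UNITS_MB[value[-1].upper()]


# Suggested mapped_memory settings from this message.
mapped = {
    "neostore.nodestore.db": "512M",
    "neostore.relationshipstore.db": "2G",
    "neostore.propertystore.db": "1G",
    "neostore.propertystore.db.strings": "128M",
    "neostore.propertystore.db.arrays": "128M",
}

mapped_total_mb = sum(to_mb(v) for v in mapped.values())
heap_mb = 4096       # wrapper.java.maxmemory from the original post
ram_mb = 8 * 1024    # "RAM of the server is 8G"
remaining_mb = ram_mb - heap_mb - mapped_total_mb

print("mapped memory total: %d MB" % mapped_total_mb)   # 3840 MB
print("left for OS and other processes: %d MB" % remaining_mb)  # 256 MB
```

Under that assumption, heap plus mapped memory uses nearly all of the 8G, so either the heap has to shrink or the machine needs more RAM.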

Disable node and relationship auto-indexing; you don't need it.

I think what you see could be:

- full scans due to missing indexes
- garbage collection

You should profile your query with the neo4j-shell (prefix the query, with
the real values filled in, with "profile") and share the results.

Michael


On Mon, Oct 27, 2014 at 4:13 PM, Aileen Agricola <
[email protected]> wrote:

> Hi Liu,
>
> I'm forwarding your question to our google group  [email protected]
> Please provide any additional information there.
>
> best,
>
> Aileen Agricola
> Web Program Manager | Neo Technology
> [email protected] | 206.437.2524
>
>
>
> On Mon, Oct 27, 2014 at 8:08 AM, LIU Xiaobing <[email protected]> wrote:
>
>> Hi experts,
>>     I am running into a performance problem with neo4j. I am trying to
>> write data to neo4j; the data scale is in the billions and the only
>> relationships in the data are people-people. When I use py2neo to
>> query and write data to neo4j, I find that it is very slow.
>>     The query I use:
>>     create_rels = 'MERGE(first:{TYPE1} {{id:'{val1}'}}) MERGE
>> (second:{TYPE2} {{id:'{val2}'}}) MERGE (first)-[r:{RTYPE}]->(second) ON
>> CREATE SET r.weight={weight_set} ON MATCH SET {weight_compute} WITH r SET
>> r.half_life={half_life},r.update_time=TIMESTAMP(),r.threshold={threshold}
>> WITH r WHERE r.weight<r.threshold DELETE r'
>>  
>> self.query=neo.CypherQuery(self.graph_db,self.create_rels.format(TYPE1=entity1[0],val1=entity1[1],TYPE2=entity2[0],val2=entity2[1],RTYPE=rel_type,weight_set=weight_set,weight_compute=CYPHER_WEIGHT_COMPUTE,half_life=half_life,threshold=threshold))
>>     self.query.execute()
>>
>>     the CYPHER_WEIGHT_COMPUTE definition is
>> "r.weight=r.weight+r.weight*EXP((TIMESTAMP()-r.update_time)/(r.half_life*1.0))"
>>
>>     The purpose of the statement is to create the nodes and relationships
>> when they are not yet in the graph db, and to update the properties of the
>> relationships when they are.
>>     I have tried the following to improve performance, but it didn't help
>> much.
>>     1) Tuned the server configuration files
>>     neo4j-wrapper.conf:
>>     wrapper.java.initmemory=4096
>>     wrapper.java.maxmemory=4096
>>     wrapper.java.minmemory=4096
>>
>>     neo4j.properties
>>     neostore.nodestore.db.mapped_memory=256M
>>     neostore.relationshipstore.db.mapped_memory=256M
>>     neostore.propertystore.db.mapped_memory=256M
>>     neostore.propertystore.db.strings.mapped_memory=128M
>>     neostore.propertystore.db.arrays.mapped_memory=128M
>>
>>     node_auto_indexing=true
>>     relationship_auto_indexing=true
>>
>>     2) Created constraints on the node properties in order to create
>> indexes
>>        Cypher statement: create constraint on (n:UID) assert n.id IS UNIQUE
>>
>>     When I check the load on the server, which has 16 4-core
>> processors, I find that the CPU load is very high while the network and
>> I/O load is not. Is Cypher CPU-greedy? How else can I improve the
>> performance? Thanks very much.
>>
>> By the way, the neo4j version is 2.1.5 stable, the py2neo client version
>> is 1.1.6, and the server has 8G of RAM.
>>
>>
>> --
>> Best Regards
>> LIU Xiaobing 刘小兵
>>
>>
>  --
> You received this message because you are subscribed to the Google Groups
> "Neo4j" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> For more options, visit https://groups.google.com/d/optout.
>
