You should really update to a newer version of Neo4j. With 2.2 you also get visual query plan profiling, which should help you a lot. Most of your queries create way too much intermediate data.
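As a rough illustration of the profiling mentioned above: in Neo4j 2.2, prefixing a Cypher statement with PROFILE runs it and reports the operators and database hits of the executed plan. The sketch below applies this to one possible rewrite of the movie recommendation query quoted further down in this thread. The `:Customer` and `:Movie` labels and the `LIMIT 10` are assumptions made for this example (the original queries use the legacy auto index and no labels); the property and parameter names are taken from the thread.

```cypher
// Sketch only. :Customer and :Movie are hypothetical labels, not part of the
// data model shown in this thread; LIMIT 10 is an arbitrary choice.
PROFILE
MATCH (me:Customer {customerId: {customerId}})
// Start from the matching movies and follow LIKES backwards,
// instead of starting from every customer.
MATCH (movie:Movie)<-[:LIKES]-(other)
WHERE movie.Language = {language}
  AND movie.MOVIE_STATUS = {status}
  AND NOT (me)-[:LIKES]->(movie)
// Aggregate per movie so each movie yields one row.
RETURN movie.Movie AS title, movie.Language AS language, count(other) AS likes
ORDER BY likes DESC
LIMIT 10
```

Comparing the reported database hits of the original and the rewritten query is one way to check whether a rewrite actually reduces the intermediate data.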
Perhaps also get some hands-on consulting / help for writing your queries.

Michael

Some tips inline.

> On 10.05.2015 at 23:58, Arun Kumar <[email protected]> wrote:
>
> Michael,
>
> Thanks for looking into this..
>
> We use Neo4j as a recommendation engine... We have movies, classifieds services
> listed in our site.. We recommend movies or classifieds to our customers
> based on their browsing behaviors... Below are some of the CQL's, we use..
>
> 1. Movie recommendation CQL..
>
> start n=node:node_auto_index('customerId:*'),
>       n1=node:node_auto_index(customerId = {customerId})
> MATCH n-[:LIKES]->movie
> where movie.Language = {language} and movie.MOVIE_STATUS={status}
> with DISTINCT movie as mov,n1 where not(n1-[:LIKES]->mov)
> with mov as movDet, count(mov) as movCt
> return movCt,movDet.Movie,movDet.Language order by movCt desc
>

-> This query creates a huge cross product.
-> You create a row for each customer and every movie that they ever liked, which might be millions. I would rather look up the movie by language and status and follow the LIKES relationship backwards.
-> The WHERE NOT is an expensive operation for each pair.
-> You create too many paths in between; you should use DISTINCT or an aggregation (even in your query).
-> You should add a limit to the result.
-> How many movies do you have in the database?

> start movie=node:node_auto_index('language:EN status:ACTIVE'),
>       n1=node:node_auto_index(customerId = {customerId})
> MATCH n-[:LIKES]->movie
> with distinct movie as mov,n1
> where not(n1-[:LIKES]->mov)
> with mov as movDet, count(mov) as movCt
> return movCt,movDet.Movie,movDet.Language
> order by movCt desc
>
> >>> Below query is used to identify the members language..

-> This query is not correct, as the count would always be 1. You don't aggregate by the same data you group by.

> start n=node:node_auto_index(customerId = {customerId})
> MATCH n-[:LIKES]->movie
> return movie.Language,count(movie.Language)
> order by count(movie.Language) desc
>
> 2. Other recommendation CQL..

This is another cross-product query. What is "a", a movie or an actor? You should make sure that the lookup of "a" is done via a schema index, and that as few "a" nodes as possible are selected.

> start n=node:node_auto_index(customerId = 'a899573d-3555-4c9d-ac1b-3f070a7decd7')
> MATCH a WHERE NOT (n)-[:LIKES]->(a) and
> a.categoryId='26' and a.State='New Jersey' and a.PostType='Offering' and
> a.POST_STATUS='A' return a.id as postId order by a.Zipcode limit 5
>

Make sure to do the same as above. Going over all customers doesn't make sense; just remove the auto-index lookup, and make sure you use a label like a:Post and a schema index for the categoryId. Your result also doesn't make sense, as you again aggregate by the same thing you return.

> 3. Trending CQL..
>
> start n=node:node_auto_index('customerId :*')
> MATCH a WHERE (n)-[:LIKES]->(a) and a.categoryId={categoryId}
> return a.id as postId, count(a) order by count(a) DESC limit 2
>
> 4. Identifying last viewed posts..

You forgot the label on your customer/user, so the index will not be used; same for "a".

> match (n {customerId : 'a899573d-3555-4c9d-ac1b-3f070a7decd7'})-[:LIKES]->(a)
> where a.categoryId='26' and a.POST_STATUS='A'
> return a.PostType as PostType,a.Stay as Stay,a.Salary as Salary,a.Age as Age,
>        a.Language as Language,a.Experience as Experience,
>        a.State as state
> order by a.POST_VIEWED_TIME desc limit 1
>
> All these queries will be fired for each member when they move across each
> page.. At any given point of time we would have 20 members on an average in
> the site and get monthly 400K page views.. Not much though...

Make sure first that your queries are in the 10-100 ms range and don't generate too many database hits.

> I tried increasing the memory as well.. Didn't help. Let me know if my CQL's
> are messed up..
>
> Thanks,
> Arun.
>
> On Sunday, May 10, 2015 at 12:59:17 PM UTC-4, Michael Hunger wrote:
> What are you doing? Can you share the type of workload / queries / code that
> you run?
>
> Which version are you using?
>
> According to your messages.log it spends all time trying to free memory
> (causing the spike).
>
>> wrapper.java.maxmemory=800
> -> you forgot to add a suffix here, so you do 800 bytes of heap, not 800 MB
> change to
>
>> wrapper.java.maxmemory=800M
>
> 800M heap is OK for smallish use cases.
>
> And you should
> 1. upgrade to 2.2.x
> 2. alternatively use more memory for memory mapping
>
>> # Default values for the low-level graph engine
>> neostore.nodestore.db.mapped_memory=100M
>> neostore.relationshipstore.db.mapped_memory=500M
>> neostore.relationshipgroupstore.db.mapped_memory=50M
>> neostore.propertystore.db.mapped_memory=500M
>> neostore.propertystore.db.strings.mapped_memory=250M
>> neostore.propertystore.db.arrays.mapped_memory=30M
>
>> On 10.05.2015 at 16:24, Arun Kumar <ar...@pragathi.com> wrote:
>>
>> Hi,
>>
>> Neo4j server CPU spikes up to 90% (and higher) as the node size increases to
>> 50 MB.. Initially the CPU is well under 15% and suddenly spikes to 90% once
>> certain size limit is reached. I have turned OFF the logs as well.
>>
>> Below is the neo4j size configuration..
>>
>> # Default values for the low-level graph engine
>> neostore.nodestore.db.mapped_memory=40M
>> neostore.relationshipstore.db.mapped_memory=40M
>> neostore.propertystore.db.mapped_memory=150M
>> neostore.propertystore.db.strings.mapped_memory=70M
>> neostore.propertystore.db.arrays.mapped_memory=30M
>>
>> keep_logical_logs=false
>> keep_logical_logs=3 days
>>
>> Below is the heap size and JVM configuration..
>>
>> # Initial Java Heap Size (in MB)
>> wrapper.java.initmemory=800
>>
>> # Maximum Java Heap Size (in MB)
>> wrapper.java.maxmemory=800
>>
>> wrapper.java.additional=-XX:+UseConcMarkSweepGC
>> wrapper.java.additional=-XX:+CMSClassUnloadingEnabled
>> wrapper.java.additional=-XX:NewRatio=3
>> wrapper.java.additional=-d64
>> wrapper.java.additional=-server
>> wrapper.java.additional=-Xss2048k
>> wrapper.java.additional=-XX:+UseParNewGC
>>
>> I have attached message.log ..
>>
>> Would appreciate any guidance in this issue.
>>
>> Thanks,
>> Arun.
>>
>> --
>> You received this message because you are subscribed to the Google Groups
>> "Neo4j" group.
>> To unsubscribe from this group and stop receiving emails from it, send an
>> email to neo4j+un...@googlegroups.com.
>> For more options, visit https://groups.google.com/d/optout.
>> <message.log>
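The label and schema index advice given in the inline comments of this thread could look roughly like the following for the "other recommendation" query. This is a sketch under stated assumptions, not a tested fix: `:Post` follows the "a:Post" suggestion in the thread, `:Customer` is a hypothetical label, and the `{state}` parameter is introduced here only to replace the hard-coded value.

```cypher
// Assumed labels: :Post (as suggested in the thread) and :Customer (hypothetical).
// Schema indexes replace the legacy node_auto_index lookups.
CREATE INDEX ON :Customer(customerId);
CREATE INDEX ON :Post(categoryId);

// Posts in one category that this customer has not liked yet.
MATCH (n:Customer {customerId: {customerId}})
MATCH (a:Post {categoryId: {categoryId}})
WHERE a.State = {state}
  AND a.PostType = 'Offering'
  AND a.POST_STATUS = 'A'
  AND NOT (n)-[:LIKES]->(a)
RETURN a.id AS postId
ORDER BY a.Zipcode
LIMIT 5;
```

The index statements are run once, separately from the query. Existing nodes would need the labels added before the indexes can be used for these lookups.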
