I have been stuck for a couple of days on a performance problem, and on 
queries that do not even produce predictable results. I posted about this 
before, but now, after upgrading the machine from 2 GB to 8 GB of RAM (hoping 
it was a memory shortage), I still don't know how to solve it.

*Environment:*
Cloud server with: 8 GB RAM, 40 GB SSD disk, Ubuntu 13.10 x64
Version : Neo4j 2.1.0-M01

*Settings adapted from defaults*
in neo4j-wrapper.conf
wrapper.java.initmemory=4096
wrapper.java.maxmemory=4096

in neo4j.properties
neostore.nodestore.db.mapped_memory=50M
neostore.relationshipstore.db.mapped_memory=100M
neostore.propertystore.db.mapped_memory=180M
neostore.propertystore.db.strings.mapped_memory=260M
neostore.propertystore.db.arrays.mapped_memory=260M

*Server info:*
HeapMemoryUsage
committed 4260102144
init 4294967296
max 4260102144
used 302304640

NonHeapMemoryUsage
committed 64909312
init 24313856
max 136314880
used 54561464
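
To put the byte counts above in perspective, here is a minimal Python sketch that converts them to MiB and computes how much of the heap was in use when this snapshot was taken. The numbers are copied from the server info above; the names `MIB`, `to_mib` and `used_fraction` are only illustrative.

```python
MIB = 1024 * 1024

# Values copied from the HeapMemoryUsage / NonHeapMemoryUsage output above (bytes).
heap = {"committed": 4260102144, "init": 4294967296,
        "max": 4260102144, "used": 302304640}
non_heap = {"committed": 64909312, "init": 24313856,
            "max": 136314880, "used": 54561464}

def to_mib(n_bytes):
    """Convert a byte count to mebibytes."""
    return n_bytes / MIB

for label, usage in (("heap", heap), ("non-heap", non_heap)):
    for key, value in usage.items():
        print(f"{label:8s} {key:9s} {to_mib(value):8.1f} MiB")

# Fraction of the maximum heap that was in use in this snapshot.
used_fraction = heap["used"] / heap["max"]
print(f"heap used: {used_fraction:.1%} of max")
```

In this snapshot only a small share of the 4 GB heap is in use, which is why I doubt that heap size alone explains the slowness.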

*Size of database*
Primitive count
NumberOfRelationshipIdsInUse 1294128
NumberOfNodeIdsInUse 55746
NumberOfPropertyIdsInUse 55036
NumberOfRelationshipTypeIdsInUse 11

*Model*

About 10k :jurt nodes (a jurt is a kind of document) and 12k :Term nodes, 
with about 1.2M (jurt)-[:HAS_TERM]->(term) relationships.

*Use*

Find similar jurts based on the number of common terms, like this:

match (j1:jurt)-[:HAS_TERM]->(t:Term)<-[:HAS_TERM]-(j2:jurt)
where NOT j1=j2 AND j1.jurt_id = {jurtid}
with j1,j2,count(t) as commonterms
return j1.jurt_id,j2.jurt_id,commonterms
order by commonterms desc
limit 3

The query above takes 5-6 seconds, which seems far too long to me.
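
To spell out what the query is meant to compute, here is a minimal Python sketch of the same counting on made-up data. The jurt ids and terms below are hypothetical and not taken from the real database, and the sketch assumes there is at most one HAS_TERM relationship between a given jurt and a given term, so that count(t) equals the number of shared terms.

```python
from collections import Counter

# Hypothetical toy data: jurt_id -> set of term names (not from the real database).
has_term = {
    "J1": {"contract", "liability", "damages"},
    "J2": {"contract", "liability", "appeal"},
    "J3": {"contract", "tax"},
    "J4": {"appeal"},
}

def similar_jurts(jurt_id, has_term, limit=3):
    """Return up to `limit` (other_id, common_term_count) pairs, highest count first."""
    own_terms = has_term[jurt_id]
    common = Counter()
    for other_id, terms in has_term.items():
        if other_id == jurt_id:          # corresponds to NOT j1 = j2
            continue
        shared = len(own_terms & terms)  # corresponds to count(t)
        if shared:                       # the MATCH only yields pairs sharing a term
            common[other_id] = shared
    return common.most_common(limit)     # ORDER BY commonterms DESC LIMIT 3

print(similar_jurts("J1", has_term))
```

For "J1" this yields J2 with 2 common terms and J3 with 1; J4 shares no term and is left out, just as the MATCH pattern would produce no row for it.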

*Query plan*

{
  "columns" : [ "j1.jurt_id", "j2.jurt_id", "commonterms" ],
  "data" : [ [ "J70000", "J72191", 68 ], [ "J70000", "J73483", 67 ], [ "J70000", "J72924", 66 ] ],
  "plan" : {
    "args" : {
      "returnItemNames" : [ "j1.jurt_id", "j2.jurt_id", "commonterms" ],
      "_rows" : 3,
      "_db_hits" : 0,
      "symKeys" : [ "j1", "j2.jurt_id", "j1.jurt_id", "commonterms", "j2" ]
    },
    "dbHits" : 0,
    "name" : "ColumnFilter",
    "children" : [ {
      "args" : {
        "limit" : "Literal(3)",
        "orderBy" : [ "SortItem(commonterms,false)" ],
        "_rows" : 3,
        "_db_hits" : 0
      },
      "dbHits" : 0,
      "name" : "Top",
      "children" : [ {
        "args" : {
          "_rows" : 9992,
          "_db_hits" : 19984,
          "exprKeys" : [ "j1.jurt_id", "j2.jurt_id" ],
          "symKeys" : [ "j1", "j2", "commonterms" ]
        },
        "dbHits" : 19984,
        "name" : "Extract",
        "children" : [ {
          "args" : {
            "returnItemNames" : [ "j1", "j2", "commonterms" ],
            "_rows" : 9992,
            "_db_hits" : 0,
            "symKeys" : [ "j1", "j2", "  INTERNAL_AGGREGATE8b273443-699b-4262-8a48-41d7a316fa44" ]
          },
          "dbHits" : 0,
          "name" : "ColumnFilter",
          "children" : [ {
            "args" : {
              "keys" : [ "j1", "j2" ],
              "_rows" : 9992,
              "aggregates" : [ "(  INTERNAL_AGGREGATE8b273443-699b-4262-8a48-41d7a316fa44,Count(t))" ],
              "_db_hits" : 0
            },
            "dbHits" : 0,
            "name" : "EagerAggregation",
            "children" : [ {
              "args" : {
                "_rows" : 478380,
                "_db_hits" : 0,
                "pred" : "(NOT(j1 == j2) AND hasLabel(j2:jurt(3)))"
              },
              "dbHits" : 0,
              "name" : "Filter",
              "children" : [ {
                "args" : {
                  "start" : {
                    "identifiers" : [ "j1" ],
                    "query" : "{jurtid}",
                    "producer" : "SchemaIndex",
                    "property" : "jurt_id",
                    "label" : "jurt"
                  },
                  "trail" : "(j1)-[  UNNAMED15:HAS_TERM WHERE (hasLabel(NodeIdentifier():Term(1)) AND hasLabel(NodeIdentifier():Term(1))) AND true]->(t)<-[  UNNAMED37:HAS_TERM WHERE hasLabel(NodeIdentifier():jurt(3)) AND true]-(j2)",
                  "_rows" : 478380,
                  "_db_hits" : 478518
                },
                "dbHits" : 478518,
                "name" : "TraversalMatcher",
                "children" : [ ],
                "rows" : 478380
              } ],
              "rows" : 478380
            } ],
            "rows" : 9992
          } ],
          "rows" : 9992
        } ],
        "rows" : 9992
      } ],
      "rows" : 3
    } ],
    "rows" : 3
  }
}
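
Since the nested plan is hard to read, here is a small Python sketch that flattens it into one line per operator. The `plan` dictionary is a trimmed copy of the plan above (the "args" entries are left out, the names, rows and dbHits are unchanged); `summarize_plan` is just an illustrative helper, not part of any Neo4j tooling.

```python
def summarize_plan(plan, depth=0):
    """Yield (depth, operator name, rows, dbHits) for a plan node and its children."""
    yield depth, plan["name"], plan["rows"], plan["dbHits"]
    for child in plan.get("children", []):
        yield from summarize_plan(child, depth + 1)

# Trimmed copy of the plan above: only name, rows, dbHits and children are kept.
plan = {"name": "ColumnFilter", "rows": 3, "dbHits": 0, "children": [
    {"name": "Top", "rows": 3, "dbHits": 0, "children": [
    {"name": "Extract", "rows": 9992, "dbHits": 19984, "children": [
    {"name": "ColumnFilter", "rows": 9992, "dbHits": 0, "children": [
    {"name": "EagerAggregation", "rows": 9992, "dbHits": 0, "children": [
    {"name": "Filter", "rows": 478380, "dbHits": 0, "children": [
    {"name": "TraversalMatcher", "rows": 478380, "dbHits": 478518, "children": []},
    ]}]}]}]}]}]}

for depth, name, rows, db_hits in summarize_plan(plan):
    print(f"{'  ' * depth}{name}: rows={rows}, dbHits={db_hits}")

total_db_hits = sum(db_hits for _, _, _, db_hits in summarize_plan(plan))
print("total dbHits:", total_db_hits)
```

Read this way, the TraversalMatcher produces 478380 rows for the single starting jurt, which the aggregation collapses to 9992 rows before the top 3 are picked.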



Every time I change the value of the {jurtid} param, the first run of the 
query returns different results than the subsequent runs.

When I provide two params, like this:

match (j1:jurt)-[:HAS_TERM]->(t:Term)<-[:HAS_TERM]-(j2:jurt) 
where NOT j1=j2 AND ((j1.jurt_id = {id1}) OR (j1.jurt_id = {id2})) 
with j1,j2,count(t) as commonterms 
return j1.jurt_id,j2.jurt_id,commonterms 
order by commonterms desc limit 3

it doesn't return anything at all in the shell, and in the browser I get an 
"Unknown error".

