Hi,

> How much memory does the machine have?

The machine has 64g of memory, so I think I can increase the page cache. But 
I would need at least twice that much memory to fit the whole graph (about 
125g) in the page cache.
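
To put rough numbers on that, here is a minimal sketch of the arithmetic. The 
store size, RAM and heap figures are the ones from the environment specs 
quoted below; the 4g of headroom for the OS is only an assumption, and the 
helper names are hypothetical.

```python
# Rough page-cache sizing from the figures in this thread.
STORE_GB = 125      # graph.db size, without transaction logs
RAM_GB = 64         # physical memory of the machine
HEAP_GB = 8         # current JVM heap
OS_RESERVE_GB = 4   # ASSUMPTION: headroom left for the OS and other processes


def max_page_cache_gb(ram_gb, heap_gb, os_reserve_gb):
    """Memory left for the page cache after heap and OS headroom."""
    return max(ram_gb - heap_gb - os_reserve_gb, 0)


def store_coverage(page_cache_gb, store_gb):
    """Fraction of the store that fits in the page cache (capped at 1.0)."""
    return min(page_cache_gb / store_gb, 1.0)


if __name__ == "__main__":
    page_cache = max_page_cache_gb(RAM_GB, HEAP_GB, OS_RESERVE_GB)
    print("largest page cache (g):", page_cache)
    print("store coverage at that size:", round(store_coverage(page_cache, STORE_GB), 2))
    print("store coverage at 32g:", round(store_coverage(32, STORE_GB), 2))
```

Under these assumptions, well under half of the store fits in memory even 
with a larger page cache, so which pages stay resident still matters.
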
In my use case, as Solr only contains a subset of the FOLDER nodes (about 
100000 nodes), I was thinking of executing a query at startup that selects 
these 100000 nodes, to warm up the cache and make sure that the page cache 
contains (at least) these nodes. Will they be evicted from the page cache 
after a certain amount of time?
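
A minimal sketch of what such a warm-up could look like, assuming the 100000 
oids can be exported from Solr as a list of strings. The helper names, the 
batch size of 1000 and the warm-up query itself are hypothetical; the query 
reads one property per node so that both the node and its property records 
are likely to be touched.

```python
# Hypothetical warm-up helper: names, query and batch size are illustrative.
WARMUP_QUERY = (
    "MATCH (node:FOLDER) WHERE node.oid IN {uuidList} "
    "RETURN count(node.name) AS touched"
)


def batches(items, size):
    """Yield consecutive slices of `items` with at most `size` elements."""
    if size <= 0:
        raise ValueError("size must be positive")
    for start in range(0, len(items), size):
        yield items[start:start + size]


def warmup_statements(oids, batch_size=1000):
    """Build one (query, parameters) pair per batch of oids."""
    return [
        (WARMUP_QUERY, {"uuidList": batch})
        for batch in batches(oids, batch_size)
    ]


if __name__ == "__main__":
    # Placeholder oids; the real ones would come from Solr.
    statements = warmup_statements(["oid-%d" % i for i in range(2500)])
    print(len(statements), "warm-up statements prepared")
```

Each (query, parameters) pair would then be passed to the driver's 
session.run(...) at application start. Batching keeps each parameter list 
small instead of sending all 100000 oids in a single statement.
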

> Which properties of the nodes do you need to be returned? The full nodes?

Yes, the full nodes have to be returned. Each node has 1 'oid' property 
(String), 1 'name' property (String), 4 boolean properties used as flags for 
business tasks and 2 long properties (creation and modification dates).

Thank you,
Vincent

On Tuesday, January 30, 2018 at 3:04:50 AM UTC+1, Michael Hunger wrote:
>
> Hi,
> this query should be better:
>
> MATCH (node:FOLDER) WHERE node.oid IN {uuidList} RETURN node
>
> Your system is definitely underpowered for this graph size.
> How much memory does the machine have?
>
> 0. Switch to Neo4j Enterprise 3.3.2 which is more memory efficient
> 1. *use an SSD*
> 2. use more memory
> 3. use a constraint instead of an index
>
> Otherwise you are effectively measuring disk speed.
>
> The problem is that the nodes might be distributed across the disk and 
> then it might have to load up to 200 pages with the HDD having to seek to 
> each of the blocks.
>
> Which properties of the nodes do you need to be returned? the full nodes?
>
>
> On Mon, Jan 29, 2018 at 5:11 PM, Vincent Mooser wrote:
>
>> Hi,
>> I am currently facing some performance problems when loading nodes using 
>> an indexed UUID. My use case is the following:
>>
>> - I run a search query in Apache Solr which returns a list of at most 
>> 200 UUIDs
>> - I load the 200 nodes corresponding to these UUIDs with the following 
>> Cypher:
>>
>> UNWIND {uuidList} AS uuid
>> MATCH (node:FOLDER {oid: uuid}) RETURN node
>>
>> The uuidList is a query parameter containing the list of UUIDs (strings).
>>
>> When the query has no page faults, it takes about 10-20ms to load the 200 
>> nodes. But when page faults appear in the query log, the query can take 
>> up to 4 seconds. I understand that some nodes have to be loaded directly 
>> from disk, but for 200 nodes, it looks very slow to me.
>>
>> The FOLDER nodes are organized like folders in a filesystem and are 
>> linked together with a 'PARENT' relationship. The only folder that does 
>> not have a parent is the root folder.
>>
>> Environment specs are:
>> - 300M nodes 
>> - 600M relationships
>> - 110M nodes with the label 'FOLDER'
>> - all FOLDER nodes have a property 'oid' whose index is online
>> - the graph.db directory is about 125g (without transaction logs)
>> - neo4j enterprise 3.2.6 and java driver 1.4.4
>> - 8g of Heap
>> - 32g of page cache
>> - no SSD
>>
>> Any hints for improving performance?
>>
>> Thank you
>> Vincent
>>
>> -- 
>> You received this message because you are subscribed to the Google Groups 
>> "Neo4j" group.
>> To unsubscribe from this group and stop receiving emails from it, send an 
>> email to [email protected].
>> For more options, visit https://groups.google.com/d/optout.
>>
>
>
