Hi,

I have noticed OAK-4638 and OAK-4412 – which both deal with particular 
problematic aspects of property indexes. I realise that both issues deal with 
slightly different problems and hence come to different suggested solutions.
But still I felt it would be good to take a holistic view on the different 
problems with property indexes. Maybe there is a unified approach we can take.

To my knowledge there are 3 areas where property indexes are problematic or not 
ideal:

1. Number of nodes: Property indexes can create a large number of nodes. For 
properties that are very common the number of index nodes can be almost as 
large as the number of the content nodes. A large number of nodes is not 
necessarily a problem in itself, but if the underlying persistence is e.g. 
MongoDB then those index nodes (i.e. MongoDB documents) cause pressure on 
MongoDB’s mmap architecture which in turn affects reading content nodes.

2. Write performance: When the persistence (e.g. MongoDB) and Oak are “far away 
from each other” (i.e. high network latency or low throughput) then synchronous 
property indexes affect the write throughput as they may cause the payload to 
double in size.

3. Commit root: I have no data on this one, but I think it might be a topic. 
Property index updates usually cause commits to have / as the commit root. 
This results in pressure on the root document.
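
A rough sketch of point 1, to make the node count concrete. It assumes the 
index mirrors each content path under /oak:index/<property>/:index/<value>/...; 
that layout, the property name resourceType and the helper names are 
assumptions made for illustration, not taken from the issues above.

```python
def ancestors_and_self(path):
    """Return every node path from the first segment down to `path`."""
    parts = [p for p in path.split("/") if p]
    return ["/" + "/".join(parts[: i + 1]) for i in range(len(parts))]


def content_nodes_for(content):
    """Collect all node paths needed to store the content itself."""
    nodes = set()
    for path in content:
        nodes.update(ancestors_and_self(path))
    return nodes


def index_nodes_for(content, prop):
    """Collect the node paths a path-mirroring property index would need.

    Assumed layout: /oak:index/<prop>/:index/<value>/<content path>.
    """
    nodes = set()
    for path, props in content.items():
        if prop in props:
            entry = f"/oak:index/{prop}/:index/{props[prop]}{path}"
            nodes.update(ancestors_and_self(entry))
    return nodes


# Hypothetical content: 1000 pages that all carry the indexed property.
content = {
    f"/content/site/page{i}": {"resourceType": "page"} for i in range(1000)
}

content_nodes = content_nodes_for(content)
index_nodes = index_nodes_for(content, "resourceType")

print(len(content_nodes), len(index_nodes))
```

Under these assumptions, a property that is set on every content node needs 
about one index node per content node, which is the situation described in 
point 1.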

Please correct me if I got anything wrong or inaccurate in the above.

My point is, however, that at the very least we should have clarity about 
which of the items above we intend to tackle with Oak improvements. Ideally we 
would have a unified approach.
(I realise that property indexes come in various flavours, e.g. unique or 
non-unique, which makes the discussion more complex.)

my2c
Michael
