Hi,

I have noticed OAK-4638 and OAK-4412, which both deal with particular problematic aspects of property indexes. I realise that the two issues deal with slightly different problems and hence arrive at different suggested solutions. Still, I felt it would be good to take a holistic view of the different problems with property indexes. Maybe there is a unified approach we can take.
To my knowledge there are 3 areas where property indexes are problematic or not ideal:

1. Number of nodes: Property indexes can create a large number of nodes. For properties that are very common, the number of index nodes can be almost as large as the number of content nodes. A large number of nodes is not necessarily a problem in itself, but if the underlying persistence is e.g. MongoDB, then those index nodes (i.e. MongoDB documents) put pressure on MongoDB’s mmap architecture, which in turn affects reading content nodes.

2. Write performance: When the persistence (i.e. MongoDB) and Oak are “far away from each other” (i.e. high network latency or low throughput), synchronous property indexes affect the write throughput, as they may cause the payload to double in size.

3. Commit root: I have no data on this one, but think it might be a topic. Property index updates usually cause commits to have / as the commit root. This results in pressure on the root document.

Please correct me if anything in the above is wrong or inaccurate.

My point is, however, that at the very least we should have clarity about which of the items above we intend to tackle with Oak improvements. Ideally we would have a unified approach.

(I realise that property indexes come in various flavours, e.g. unique or non-unique, which makes the discussion more complex.)

my2c
Michael