Thomas Mueller created OAK-6381:
-----------------------------------
Summary: Improved index analysis tools
Key: OAK-6381
URL: https://issues.apache.org/jira/browse/OAK-6381
Project: Jackrabbit Oak
Issue Type: Improvement
Reporter: Thomas Mueller
Assignee: Thomas Mueller
Fix For: 1.8
It would be good to have more tools to analyze indexes:
* For Lucene indexes, get a histogram of sampled terms. We have
"getFieldInfo", which shows how common each field is, but nothing comparable
for terms. For example, the /oak:index/lucene index contains 1 million
fulltext fields and node names for 1 million nodes, but I wonder why, what
the typical node names are, and whether the fulltext for most nodes is
actually empty. Maybe add a new method "getTermHistogram(int sampleCount)"
or similar.
* For property indexes, the number of updated nodes per second or so. Right
now we can only analyze the counts per key, but some indexes / keys are very
volatile (they see many short-lived entries).
* For Lucene indexes, the write volume (in MB per second or so).
* How indexes are used (approximate nodes read / MB read per hour).
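
To make the term histogram idea a bit more concrete, here is a minimal sketch
in plain Java. It is not Oak or Lucene code: the class name, the
Iterator<String> input and the "seed" parameter are assumptions made for
illustration, and a real implementation would have to walk the terms of the
Lucene index instead. The sketch draws a uniform sample of "sampleCount" terms
using reservoir sampling, then counts how often each term occurs in the sample.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Random;

// Hypothetical helper for illustration only; not part of Oak or Lucene.
class TermHistogramSketch {

    // Samples up to sampleCount terms uniformly (reservoir sampling) and
    // returns term -> count within the sample, most frequent term first.
    static Map<String, Integer> getTermHistogram(
            Iterator<String> terms, int sampleCount, long seed) {
        List<String> sample = new ArrayList<>();
        Random random = new Random(seed);
        long seen = 0;
        while (terms.hasNext()) {
            String term = terms.next();
            seen++;
            if (sample.size() < sampleCount) {
                // Fill the reservoir with the first sampleCount terms.
                sample.add(term);
            } else {
                // Keep the new term with probability sampleCount / seen.
                long j = (long) (random.nextDouble() * seen);
                if (j < sampleCount) {
                    sample.set((int) j, term);
                }
            }
        }

        // Count occurrences of each term in the sample.
        Map<String, Integer> counts = new HashMap<>();
        for (String term : sample) {
            counts.merge(term, 1, Integer::sum);
        }

        // Order by descending count, then by term, for a stable result.
        List<Map.Entry<String, Integer>> entries =
                new ArrayList<>(counts.entrySet());
        entries.sort((a, b) -> {
            int c = Integer.compare(b.getValue(), a.getValue());
            return c != 0 ? c : a.getKey().compareTo(b.getKey());
        });
        Map<String, Integer> histogram = new LinkedHashMap<>();
        for (Map.Entry<String, Integer> e : entries) {
            histogram.put(e.getKey(), e.getValue());
        }
        return histogram;
    }

    public static void main(String[] args) {
        // Made-up node names, only to show the call.
        List<String> names = Arrays.asList(
                "jcr:content", "jcr:content", "jcr:content",
                "par", "par", "image");
        System.out.println(getTermHistogram(names.iterator(), 100, 42L));
    }
}
```

Reservoir sampling is used here so that the terms only need to be read once
and memory stays bounded by "sampleCount", which seems to matter for an index
with a million entries. Whether sampling terms or sampling documents gives
the more useful picture is left open.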