Hi S G,

This looks useful, and it should be easy to add to the existing metrics in ReplicationHandler, probably somewhere around ReplicationHandler:856.
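
For what it's worth, the arithmetic behind the proposed metric is small. Below is a rough sketch of it, under a few assumptions: the class RecoveryProgress and its methods are hypothetical names made up for illustration (they are not existing Solr code), sizes are reported as raw bytes rather than formatted strings like "77 gb", and the wiring into ReplicationHandler's metrics is deliberately left out.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.stream.Stream;

// Hypothetical helper, not part of Solr: every name here is illustrative.
final class RecoveryProgress {

    private RecoveryProgress() {}

    // Sums the sizes of all regular files under dir, similar in spirit to `du`.
    // Files that vanish mid-walk (likely during a recovery) surface as an exception.
    static long directorySizeBytes(Path dir) throws IOException {
        try (Stream<Path> paths = Files.walk(dir)) {
            return paths.filter(Files::isRegularFile)
                        .mapToLong(RecoveryProgress::sizeOf)
                        .sum();
        }
    }

    private static long sizeOf(Path file) {
        try {
            return Files.size(file);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Whole-number percentage of expectedBytes already copied, clamped to 0..100.
    static int percentRecovered(long currentBytes, long expectedBytes) {
        if (expectedBytes <= 0 || currentBytes <= 0) {
            return 0;
        }
        double pct = 100.0 * currentBytes / expectedBytes;
        return (int) Math.min(100.0, pct);
    }

    // Builds a map with the same keys as the JSON proposed in the quoted mail.
    static Map<String, Object> snapshot(long currentBytes, long expectedBytes,
                                        long startTimeEpoch) {
        Map<String, Object> recovery = new LinkedHashMap<>();
        recovery.put("currentSize", currentBytes);
        recovery.put("expectedSize", expectedBytes);
        recovery.put("percentRecovered", percentRecovered(currentBytes, expectedBytes));
        recovery.put("startTimeEpoch", startTimeEpoch);
        return recovery;
    }
}
```

With the figures from your du output, percentRecovered(77, 145) comes out at 53, in line with your "about 50%" estimate. The real work would be in getting the expected size from the leader and the current size from the index directory being downloaded.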

> On 14 Mar 2018, at 20:16, S G <sg.online.em...@gmail.com> wrote:
>
> Hi,
>
> Solr does full recoveries very frequently - sometimes even for seemingly
> simple cases like adding a field to the schema, a couple of nodes go into
> recovery.
> It would be nice if it did not do such full recoveries so frequently but
> since that may require a lot of fixing, can we have a metric that reports
> how much a core has recovered already?
>
> Example:
>
> $ cd data
> $ du -h . | grep my_collection | grep -w index
> 77G ./my_collection_shard3_replica2/data/index.20180314184942993
> 145G ./my_collection_shard3_replica2/data/index.20180112001943687
>
> This shows that the shard3-replica2 core is doing a full recovery and has
> only copied 77G out of 145G
> That is about 50% recovery done.
>
>
> It would be very nice if we can have this as a JMX metric and we can then
> plot it somewhere instead of having to keep running the same command in a
> loop and guessing how much is left to be copied.
>
> A metric like the following would be great:
> {
>   "my_collection_shard3_replica2": {
>     "recovery": {
>       "currentSize": "77 gb",
>       "expectedSize": "145 gb",
>       "percentRecovered": "50",
>       "startTimeEpoch": "361273126317"
>     }
>   }
> }
>
> If it looks useful, I will open a JIRA for the same.
>
> Thanks
> SG