Re: Expose a metric for percentage-recovered during full recoveries
Hi S G,

This looks useful, and it should be easy to add to the existing metrics in ReplicationHandler, probably somewhere around ReplicationHandler:856.

> On 14 Mar 2018, at 20:16, S G wrote:
>
> Hi,
>
> Solr does full recoveries very frequently - sometimes even for seemingly
> simple cases like adding a field to the schema, a couple of nodes go into
> recovery.
> It would be nice if it did not do such full recoveries so frequently but
> since that may require a lot of fixing, can we have a metric that reports
> how much a core has recovered already?
>
> Example:
>
> $ cd data
> $ du -h . | grep my_collection | grep -w index
> 77G ./my_collection_shard3_replica2/data/index.20180314184942993
> 145G ./my_collection_shard3_replica2/data/index.20180112001943687
>
> This shows that the shard3-replica2 core is doing a full recovery and has
> only copied 77G out of 145G
> That is about 50% recovery done.
>
>
> It would be very nice if we can have this as a JMX metric and we can then
> plot it somewhere instead of having to keep running the same command in a
> loop and guessing how much is left to be copied.
>
> A metric like the following would be great:
> {
>   "my_collection_shard3_replica2": {
>     "recovery": {
>       "currentSize": "77 gb",
>       "expectedSize": "145 gb",
>       "percentRecovered": "50",
>       "startTimeEpoch": "361273126317"
>     }
>   }
> }
>
> If it looks useful, I will open a JIRA for the same.
>
> Thanks
> SG
Re: Expose a metric for percentage-recovered during full recoveries
S

Were there errors in the logs just before recoveries?

Rick
--
Sorry for being brief. Alternate email is rickleir at yahoo dot com
Expose a metric for percentage-recovered during full recoveries
Hi,

Solr does full recoveries very frequently - sometimes even for seemingly simple cases like adding a field to the schema, a couple of nodes go into recovery.
It would be nice if it did not do such full recoveries so frequently but since that may require a lot of fixing, can we have a metric that reports how much a core has recovered already?

Example:

$ cd data
$ du -h . | grep my_collection | grep -w index
77G ./my_collection_shard3_replica2/data/index.20180314184942993
145G ./my_collection_shard3_replica2/data/index.20180112001943687

This shows that the shard3-replica2 core is doing a full recovery and has only copied 77G out of 145G.
That is about 50% recovery done.

It would be very nice if we can have this as a JMX metric and we can then plot it somewhere instead of having to keep running the same command in a loop and guessing how much is left to be copied.

A metric like the following would be great:

{
  "my_collection_shard3_replica2": {
    "recovery": {
      "currentSize": "77 gb",
      "expectedSize": "145 gb",
      "percentRecovered": "50",
      "startTimeEpoch": "361273126317"
    }
  }
}

If it looks useful, I will open a JIRA for the same.

Thanks
SG
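
Until such a metric exists, the manual du check above can be approximated with a small script. This is only a rough sketch, not anything Solr provides: the function names and the field names currentSizeBytes and expectedSizeBytes are made up for illustration, and it assumes the old index directory is a fair stand-in for the expected size, which may not hold if the leader's index has grown or been merged since.

```python
import json
import os


def dir_size_bytes(path):
    """Sum the sizes of all regular files under path (close to `du`, minus block rounding)."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            try:
                total += os.path.getsize(os.path.join(root, name))
            except OSError:
                # Files can appear and vanish while a recovery is copying; skip them.
                pass
    return total


def percent_recovered(current_bytes, expected_bytes):
    """Integer percentage clamped to 0..100; 0 when the expected size is unknown."""
    if expected_bytes <= 0:
        return 0
    return max(0, min(100, int(100 * current_bytes / expected_bytes)))


def recovery_report(core_name, new_index_dir, old_index_dir):
    """Build a dict shaped like the proposed metric (hypothetical field names).

    Assumption: the old index directory approximates the final size of the new one.
    """
    current = dir_size_bytes(new_index_dir)
    expected = dir_size_bytes(old_index_dir)
    return {
        core_name: {
            "recovery": {
                "currentSizeBytes": current,
                "expectedSizeBytes": expected,
                "percentRecovered": percent_recovered(current, expected),
            }
        }
    }


if __name__ == "__main__":
    # Paths taken from the du output above, relative to the data directory.
    base = "./my_collection_shard3_replica2/data"
    report = recovery_report(
        "my_collection_shard3_replica2",
        os.path.join(base, "index.20180314184942993"),
        os.path.join(base, "index.20180112001943687"),
    )
    print(json.dumps(report, indent=2))
```

With the sizes from the example, percent_recovered(77, 145) gives 53, so "about 50%" is a fair reading. Raw byte counts are used instead of strings like "77 gb" so the values can be plotted directly.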