Hi, I have added a Histogram metric in ActiveGossip for sendSharedData, sendPerNodeData, and sendMembership.
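To make the review request concrete, here is a minimal, stdlib-only sketch of the timing pattern described above. It is not the code in the linked repository: `SendTimingSketch`, `SimpleHistogram`, and `timed` are hypothetical names, and `SimpleHistogram` only stands in for a real metrics-library histogram. The three metric names are the ActiveGossip operations listed above, and the empty lambdas are placeholders for the real send calls.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch: times each ActiveGossip send operation and records
// the elapsed milliseconds in a per-operation histogram.
class SendTimingSketch {

    // Assumed stand-in for a metrics-library histogram; it simply keeps raw samples.
    static final class SimpleHistogram {
        private final List<Long> samples = new ArrayList<>();

        synchronized void update(long value) {
            samples.add(value);
        }

        synchronized int count() {
            return samples.size();
        }

        synchronized long max() {
            return samples.isEmpty() ? 0L : Collections.max(samples);
        }
    }

    private final Map<String, SimpleHistogram> histograms = new ConcurrentHashMap<>();

    // Returns the histogram registered under this name, creating it on first use.
    SimpleHistogram histogram(String name) {
        return histograms.computeIfAbsent(name, k -> new SimpleHistogram());
    }

    // Runs the operation and records its duration, even if it throws.
    void timed(String name, Runnable operation) {
        long start = System.nanoTime();
        try {
            operation.run();
        } finally {
            long elapsedMs = TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
            histogram(name).update(elapsedMs);
        }
    }

    public static void main(String[] args) {
        SendTimingSketch metrics = new SendTimingSketch();
        String[] names = {"sendSharedData", "sendPerNodeData", "sendMembership"};
        for (String name : names) {
            // Placeholder body; the real code would call the matching send method.
            metrics.timed(name, () -> { });
        }
        for (String name : names) {
            SimpleHistogram h = metrics.histogram(name);
            System.out.println(name + ": count=" + h.count() + ", maxMs=" + h.max());
        }
    }
}
```

Wrapping the timing in one helper keeps the measurement out of the send methods themselves, which matches the goal of not cluttering the code.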
Link: https://github.com/chandresh-pancholi/incubator-gossip

Could someone check out the code and review it before I finish the other metrics? I want to know whether I am moving in the right direction with the code.

On Tue, Oct 11, 2016 at 11:50 PM, Edward Capriolo <[email protected]> wrote:

> I would say a few things. There are a lot of things going on in the
> software that are interesting.
>
> We have several queues and thread pools.
>
> It makes sense to put
> http://metrics.dropwizard.io/3.1.0/getting-started/#gauges around those.
> This will give us visibility as to how close those are to 0 at any given
> time.
>
> We now have per-node data:
>
> https://issues.apache.org/jira/browse/GOSSIP-21
> https://issues.apache.org/jira/browse/GOSSIP-25
>
> It makes sense to use gauges to record the size of these. We should also
> use meters to count how many operations/sec are caused by users adding
> data as well as the internode process replicating data.
>
> For PassiveGossipThread I could see us counting messages received as a
> meter. We could count corrupt messages separately as a meter. We could
> also capture this data per host:
>
> gossipfrom.node1.goodmessages
> gossipfrom.node1.badmessages
>
> As well as globally:
>
> gossipfrom.badmessages
> gossipfrom.goodmessages
>
> For ActiveGossip we could use histograms to track the time to process
>
> sendSharedData
> sendPerNodeData
> sendMembership
>
> We could use a gauge to track the size of this.scheduledExecutorService =
> Executors.newScheduledThreadPool(2); and other executors to make sure
> that queue is not backing up/blocked. Again you can track this per host
> and globally.
>
> I am an ex-system administrator so I am generally ok with as many metrics
> as possible as long as we do not clutter the code. There are ways to do
> aspect/annotation driven counters as well so we can always look to refactor
> around those things if we want to.
>
> If you see something that seems like a point of possible contention or
> something that you believe is important to track I would capture that. In
> the long run there is something to consider about tracking metrics from 1k
> node clusters but we are not there yet and metrics is generally lighter
> than the code anyway.
>
> Thanks for taking the time to look at this.
> Edward
>
> On Tue, Oct 11, 2016 at 2:04 PM, chandresh pancholi <
> [email protected]> wrote:
>
> > Hi,
> >
> > I wanted to know where to begin working on this issue.
> > Could someone please help me out with where to start and how to
> > proceed with it?
> >
> > For Histogram I see ActiveThreadGroup and PassiveThreadGroup are doing
> > inter-node operations.
> >
> > Where are we tracking successful and failed requests so that we can
> > generate meter metrics?
> >
> > Any kind of help is appreciated.
> >
> > --
> > Chandresh Pancholi
> > Senior Software Engineer
> > Flipkart.com
> > Email-id:[email protected]
> > Contact:08951803660

--
Chandresh Pancholi
Senior Software Engineer
Flipkart.com
Email-id:[email protected]
Contact:08951803660
