GOSSIP-17

chandresh pancholi Mon, 12 Dec 2016 06:31:16 -0800

Hi,

I have added a Histogram metric in ActiveGossip for sendSharedData,
sendPerNodeData, and sendMembership.


Link : https://github.com/chandresh-pancholi/incubator-gossip

Could someone check out the code and review it before I continue with the
other metrics? I want to know whether I am moving in the right direction
with the code.
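
For anyone who does not have the repository at hand, here is a rough sketch
of the idea of timing each send call and recording the duration in a
histogram. It is not the code in the linked repository and it does not use
the Dropwizard API: DurationHistogram and its methods are hypothetical,
JDK-only stand-ins chosen so that the example runs on its own.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-in for a metrics histogram; not the Dropwizard API
// and not the code in the linked repository.
class DurationHistogram {
    private final List<Long> samples = new ArrayList<>();

    // Record one observed duration in nanoseconds.
    synchronized void update(long nanos) {
        samples.add(nanos);
    }

    // Number of durations recorded so far.
    synchronized int count() {
        return samples.size();
    }

    // Nearest-rank quantile over all samples, q in (0, 1]; returns 0 when empty.
    synchronized long quantile(double q) {
        if (samples.isEmpty()) {
            return 0L;
        }
        List<Long> sorted = new ArrayList<>(samples);
        Collections.sort(sorted);
        int rank = (int) Math.ceil(q * sorted.size());
        int index = Math.min(Math.max(rank, 1), sorted.size()) - 1;
        return sorted.get(index);
    }

    // Run the task and record how long it took, even if it throws.
    void time(Runnable task) {
        long start = System.nanoTime();
        try {
            task.run();
        } finally {
            update(System.nanoTime() - start);
        }
    }
}
```

With one such histogram per operation, each of sendSharedData,
sendPerNodeData and sendMembership would be wrapped in time(). Unlike this
sketch, a real metrics library typically bounds memory by sampling rather
than keeping every value.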





On Tue, Oct 11, 2016 at 11:50 PM, Edward Capriolo <[email protected]>
wrote:

> I would say a few things. There are a lot of things going on in the
> software that are interesting.
>
> We have several queues and thread pools.
>
> It makes sense to put
> http://metrics.dropwizard.io/3.1.0/getting-started/#gauges around those.
> This will give us visibility as to how close those are to 0 at any given
> time.
>
> We now have per-node data:
>
> https://issues.apache.org/jira/browse/GOSSIP-21
> https://issues.apache.org/jira/browse/GOSSIP-25
>
> It makes sense to use gauges to record the size of these. We should also
> use meters to count how many operations/sec are caused by users adding
> data as well as by the internode process replicating data.
>
> For PassiveGossipThread I could see us counting messages received as a
> meter. We could count corrupt messages separately as a meter. We could
> also capture this data per host:
>
> gossipfrom.node1.goodmessages
> gossipfrom.node1.badmessages
>
> As well as globally
>
> gossipfrom.badmessages
> gossipfrom.goodmessages
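
One way to read the naming scheme above, as a sketch only: every received
message bumps both a per-host counter and the matching global one. The
class GossipFromCounters and its methods are hypothetical and use only the
JDK. It keeps plain counts, whereas a real meter would also track rates.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.LongAdder;

// Hypothetical helper: counts only, whereas a real meter would also track rates.
class GossipFromCounters {
    private final ConcurrentMap<String, LongAdder> counters = new ConcurrentHashMap<>();

    // Count one received message both for the sending node and globally.
    void record(String node, boolean good) {
        String suffix = good ? "goodmessages" : "badmessages";
        increment("gossipfrom." + node + "." + suffix);
        increment("gossipfrom." + suffix);
    }

    // Current value of a counter by full name; 0 if it was never incremented.
    long get(String name) {
        LongAdder adder = counters.get(name);
        return adder == null ? 0L : adder.sum();
    }

    private void increment(String name) {
        counters.computeIfAbsent(name, key -> new LongAdder()).increment();
    }
}
```

For example, record("node1", true) increments both
gossipfrom.node1.goodmessages and gossipfrom.goodmessages.
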
>
> For ActiveGossip we could use histograms to track the time to process
>
> sendSharedData
> sendPerNodeData
> sendMembership
>
> We could use a gauge to track the queue size of
> this.scheduledExecutorService = Executors.newScheduledThreadPool(2); and
> other executors to make sure that the queue is not backing up/blocked.
> Again you can track this per host and globally
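
A minimal sketch of such a gauge, assuming the executor is available as a
ScheduledThreadPoolExecutor so that its work queue can be inspected. The
helper class ExecutorGauges is hypothetical. The gauge is modelled as a
plain java.util.function.Supplier that reads the value on demand, not as a
metrics-library type.

```java
import java.util.concurrent.ScheduledThreadPoolExecutor;
import java.util.function.Supplier;

// Hypothetical helper: a gauge here is just a callback that reads the current value.
class ExecutorGauges {
    // Number of tasks waiting in the executor's queue (scheduled but not yet running).
    static Supplier<Integer> queueSize(ScheduledThreadPoolExecutor executor) {
        return () -> executor.getQueue().size();
    }
}
```

The value is only a snapshot, since tasks can enter or leave the queue while
it is being read, but that should be enough to spot a queue that keeps
growing.
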
>
> I am an ex-system administrator so I am generally ok with as many metrics
> as possible as long as we do not clutter the code. There are ways to do
> aspect/annotation driven counters as well so we can always look to refactor
> around those things if we want to.
>
> If you see something that seems like a point of possible contention or
> something that you believe is important to track I would capture that. In
> the long run there is something to consider about tracking metrics from 1k
> node clusters but we are not there yet and metrics are generally lighter
> than the code anyway.
>
> Thanks for taking the time to look at this.
> Edward
>
>
>
>
>
> On Tue, Oct 11, 2016 at 2:04 PM, chandresh pancholi <
> [email protected]> wrote:
>
> > Hi,
> >
> > I wanted to know where to begin working on this issue.
> > Someone please help me out with where to start and how to proceed with
> > it.
> >
> > For Histogram, I see that ActiveThreadGroup and PassiveThreadGroup are
> > doing the inter-node operations.
> >
> > Where are we tracking successful and failed requests, so that we can
> > generate meter metrics?
> >
> > Any kind of help is appreciated.
> >
> > --
> > Chandresh Pancholi
> > Senior Software Engineer
> > Flipkart.com
> > Email-id:[email protected]
> > Contact:08951803660
> >
>



-- 
Chandresh Pancholi
Senior Software Engineer
Flipkart.com
Email-id:[email protected]
Contact:08951803660

Re: [GOSSIP-17] https://issues.apache.org/jira/browse/GOSSIP-17
