Re: Review Request 70783: Added debug logging for metrics which are slow to become ready.

Greg Mann Wed, 05 Jun 2019 08:42:38 -0700


> On June 5, 2019, 1:05 a.m., Benjamin Mahler wrote:
> > 3rdparty/libprocess/src/metrics/metrics.cpp
> > Lines 199-207 (patched)
> > <https://reviews.apache.org/r/70783/diff/1/?file=2147733#file2147733line199>
> >
> >     If we consolidate the logging between user timeout and slow timeout can 
> > we have one path?
> >     
> >     ```
> >       Future<Nothing> waited = <all futures ready OR user timeout OR slow 
> > timeout>;
> >     
> >       return waited
> >         .then(defer(self(), &Self::_snapshotHandler, ...));
> >     }
> >       
> >     _snapshotHandler(...)
> >     {
> >       // LOG(INFO) not ready futures after timeout (user or built-in SLOW 
> > one)
> >       ...
> >         
> >       Future<Nothing> waited = <all futures ready OR user timeout>;
> >     
> >       return waited
> >         .then(defer(self(), &Self::__snapshotHandler, ...));
> >     }
> >       
> >     __snapshotHandler(...)
> >     {
> >       ...
> >       return response;
> >     }
> >     ```
> >     
> >     We would log those not ready after either the user provided one, or the 
> > built-in slow one, whichever is smaller (I guess we can use a min() for 
> > that too if we want). The downside here seems to be that if a user 
> > specifies a small one and still lots of pull gauges are used, will lead to 
> > a lot of INFO logging. Not sure how much we should worry about this case.
> >     
> >     We should probably name the endpoint handler functions and snapshot map 
> > function differently or house the handler ones in an 'Http' sub-struct, a 
> > bit confusing right now.


I had considered a similar option previously but ended up opting for a more 
minimal diff. I wrote a new solution along these lines and although the patch 
is now much more complex, I think the end result is cleaner. The complexity 
largely arises from the fact that if the user specifies a timeout longer than 
the hard-coded timeout, then we can potentially execute the logging code twice. 
Let me know what you think.


- Greg


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/70783/#review215693
-----------------------------------------------------------


On June 5, 2019, 3:39 p.m., Greg Mann wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/70783/
> -----------------------------------------------------------
> 
> (Updated June 5, 2019, 3:39 p.m.)
> 
> 
> Review request for mesos, Benno Evers, Benjamin Mahler, Joseph Wu, and Vinod 
> Kone.
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> Added debug logging for metrics which are slow to become ready.
> 
> 
> Diffs
> -----
> 
>   3rdparty/libprocess/include/process/metrics/metrics.hpp 
> 75711edbaf46797e5eb54ba720ea11cf3de81522 
>   3rdparty/libprocess/src/metrics/metrics.cpp 
> 623d44adbe838f995ddbe89ee26f5bcc9c600be5 
>   3rdparty/libprocess/src/tests/metrics_tests.cpp 
> 881275693e67f3c9fb670c7e70cb5014090ed7a5 
> 
> 
> Diff: https://reviews.apache.org/r/70783/diff/3/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Greg Mann
> 
>

Re: Review Request 70783: Added debug logging for metrics which are slow to become ready.

Reply via email to