----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/13603/#review31835 -----------------------------------------------------------
3rdparty/libprocess/include/process/statistics.hpp <https://reviews.apache.org/r/13603/#comment60552> Is this for std::pair? Maybe a comment? 3rdparty/libprocess/include/process/statistics.hpp <https://reviews.apache.org/r/13603/#comment60553> I don't mind the formatting but it's not standard as far as I know. 3rdparty/libprocess/include/process/statistics.hpp <https://reviews.apache.org/r/13603/#comment60550> Any reason not to return a Future<TimeSeries<double>>? 3rdparty/libprocess/include/process/statistics.hpp <https://reviews.apache.org/r/13603/#comment60551> Let's move TimeSeries into it's own header. 3rdparty/libprocess/include/process/statistics.hpp <https://reviews.apache.org/r/13603/#comment60555> Not necessary (and not standard). 3rdparty/libprocess/include/process/statistics.hpp <https://reviews.apache.org/r/13603/#comment60556> What about just doing 'capacity(std::max(3, _capacity))' above instead? I don't think keeping around another 2 instances of T will likely be that big of a deal. 3rdparty/libprocess/include/process/statistics.hpp <https://reviews.apache.org/r/13603/#comment60554> Any reason not to use const& for the parameters? 3rdparty/libprocess/include/process/statistics.hpp <https://reviews.apache.org/r/13603/#comment60557> s/lg/log/ 3rdparty/libprocess/include/process/statistics.hpp <https://reviews.apache.org/r/13603/#comment60558> What about a TimeSeries::Value struct and returning that instead of a pair in order to capture the semantics explicitly? 3rdparty/libprocess/include/process/statistics.hpp <https://reviews.apache.org/r/13603/#comment60560> The comment around the technique makes sense but the implementation could use some comments too, i.e., a description of how you expect the implementation to capture the notion of "halving" would be great. You could use an example of a time series getting bigger but not going over capacity (the while loop conditional) as well as a time series getting bigger and reaching the halfway point. 3rdparty/libprocess/include/process/statistics.hpp <https://reviews.apache.org/r/13603/#comment60559> It would be great to have some more comments and maybe even some CHECKs that capture your expected semantics re: 'next'. For example, the first time this is called you expect 'index' to be none and that's when you set 'next'. Likewise, after a truncation it might be the case that 'next' is invalid and you need to reset it. A comment down by 'Option<size_t> index;' would be good too (saying something like it starts none until the first sparsify call). 3rdparty/libprocess/include/process/statistics.hpp <https://reviews.apache.org/r/13603/#comment60567> The use of 'truncate' here is not great. You're not truncating, you're sparsifing here. 3rdparty/libprocess/src/statistics.cpp <https://reviews.apache.org/r/13603/#comment60561> Formatting? 3rdparty/libprocess/src/statistics.cpp <https://reviews.apache.org/r/13603/#comment60562> Why not statistics[context][name] = ? 3rdparty/libprocess/src/statistics.cpp <https://reviews.apache.org/r/13603/#comment60563> const&? 3rdparty/libprocess/src/tests/statistics_tests.cpp <https://reviews.apache.org/r/13603/#comment60566> By a "round of truncation", do you mean sparsification? IIUC, I would expect that you can add less things in order for sparsification to occur. Perhaps you're being more general here? But to capture the algorithm I think that you could add strictly less. Also, given that some truncation can occur after some time, should you pause the clock for this test? 3rdparty/libprocess/src/tests/statistics_tests.cpp <https://reviews.apache.org/r/13603/#comment60568> s/time-series/time series/ 3rdparty/libprocess/src/tests/statistics_tests.cpp <https://reviews.apache.org/r/13603/#comment60569> Can you test that it's sparsified at this point then? And again, it's sparsification since the clock is paused correct? 3rdparty/libprocess/src/tests/statistics_tests.cpp <https://reviews.apache.org/r/13603/#comment60565> s/time-series/time series/ 3rdparty/libprocess/src/tests/statistics_tests.cpp <https://reviews.apache.org/r/13603/#comment60571> s/time-series/time series/ - Benjamin Hindman On Jan. 14, 2014, 7:08 a.m., Ben Mahler wrote: > > ----------------------------------------------------------- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/13603/ > ----------------------------------------------------------- > > (Updated Jan. 14, 2014, 7:08 a.m.) > > > Review request for mesos, Benjamin Hindman and Jie Yu. > > > Repository: mesos-git > > > Description > ------- > > The first part of this change is the removal of the archive operation. > > This also exposes a TimeSeries data structure used by Statistics. > > The goal here is to provide reasonable historical information for high > frequency time series that rapidly exceed capacity. > TimeSeries allows a window of values to be stored, while "sparsifying" older > values when the capacity is exceeded. > See the documentation on TimeSeries for details. > > An alternative, more deterministic approach, is to use sampling on older > values. > Both approaches have their advantages and disadvantages. > > > Diffs > ----- > > 3rdparty/libprocess/include/process/statistics.hpp > ce122a5eaa1aa9c09207cc8c9428136b681561cf > 3rdparty/libprocess/src/statistics.cpp > d4ba9f146f7b9b46525a0e27fbfb3d61a21a94fc > 3rdparty/libprocess/src/tests/statistics_tests.cpp > e6c9a1b776d6f3339f96898c3501d2a00c416006 > > Diff: https://reviews.apache.org/r/13603/diff/ > > > Testing > ------- > > Added tests to verify the truncation and sparsification of time series. > > > Thanks, > > Ben Mahler > >
