On 08/20/2015 07:13 AM, Amos Jeffries wrote:

>> This part of the email thread was discussing whether the existence of
>> admin scripts (rather than various imprecise syntax rules or
>> cachemgr.cgi code) should be the primary factor in our decision making.
>> How is the above information relevant to that discussion?
> If such admin scripts are based on anything beyond blind guesses it will
> be based on that public documentation and discussion.

Thank you for explaining why you talked about Guide and cachemgr.cgi code.

I suspect most admin scripts are based on the output received from
Squid. You may call that "blind guesses" if you wish; the label is not
important in this case. Since there was no public documentation when
most of those scripts were hacked together, I do not think we should
expect much else, *even* if we fantasize about admins that read guide
books and study obscure source code just to parse simple (but
non-standard) output.


>>>>>>> +class PayloadFormatter
>>>>>> ...
>>>>>>> +    Packable &outBuf;
>>>>>>> +};
>>>>>>
>>>>>> Since the majority of PayloadFormatter methods and callers are going to
>>>>>> assemble and format pieces of text and numbers, I think this should
>>>>>> become an ostream.

>>>>> You want to go back to PackableStream now?
>>>>
>>>> If possible, the new page/report/payload building interface should be
>>>> ostream-enabled, not PackableStream-enabled. IIRC, that's what I
>>>> sketched a few emails back.
>>>
>>> So how does the data get from std::ostreambuf to the client if it's not
>>> going near a Packable (i.e., StoreEntry)?
>>
>> I do not think ostream should use std::ostreambuf if that is what you
>> meant. It should use our own buffer, backed by Packable (or StoreEntry).
>> However, cache manager code would not know that and, hence, there should
>> be no linking problems associated with that knowledge.
>
> Okay. I think I understand you there.
>
> But the statement about cachemgr does not quite click. Formatter and all
> of that *is* cache manager (Mgr::) code. I don't see how Mgr:: code
> cannot know how its own internals work.
>
> I assume by "cache manager code" you mean Action and the HTTP
> request/reply handling part of Mgr::.

... and Page/Formatter itself.
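To make the "our own buffer, backed by Packable" idea concrete, here is a minimal sketch of a streambuf that forwards writes straight to a Packable-like sink. The Packable and PackableStream names follow this thread, but the details (StringPackable, PackableBuf, the exact method signatures) are invented for illustration and do not match Squid's actual classes:

```cpp
#include <cassert>
#include <ostream>
#include <streambuf>
#include <string>

// hypothetical stand-in for Squid's Packable interface (real API differs)
class Packable
{
public:
    virtual ~Packable() {}
    virtual void append(const char *buf, int size) = 0;
};

// simplified Packable storing into a string, standing in for StoreEntry
class StringPackable: public Packable
{
public:
    virtual void append(const char *buf, int size) { data.append(buf, size); }
    std::string data;
};

// An unbuffered streambuf that hands every character sequence straight to
// the Packable sink, so "<<" output lands in the sink with a single copy
// instead of being staged in an intermediate stream buffer first.
class PackableBuf: public std::streambuf
{
public:
    explicit PackableBuf(Packable &p): sink(p) {}

protected:
    virtual int_type overflow(int_type c = traits_type::eof()) {
        if (c != traits_type::eof()) {
            const char ch = traits_type::to_char_type(c);
            sink.append(&ch, 1);
        }
        return traits_type::not_eof(c);
    }

    virtual std::streamsize xsputn(const char *s, std::streamsize n) {
        sink.append(s, static_cast<int>(n));
        return n;
    }

private:
    Packable &sink;
};

// a Packable-backed ostream, similar in spirit to PackableStream
class PackableStream: public std::ostream
{
public:
    explicit PackableStream(Packable &p): std::ostream(&buf), buf(p) {}

private:
    PackableBuf buf;
};
```

With this shape, code holding a `PackableStream &` sees only an ostream; only the creator of the stream knows that a StoreEntry (here, a string) is the actual destination.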
The code that needs to know about StoreEntry is the code that creates a
specific ostream object to configure the Page/Formatter object with.
Whether that code is inside the Mgr:: namespace is irrelevant to the
linking problems you were worried about.


>>> You ask to data-copy into the stream buffer, then data-copy a second
>>> time to the StoreEntry buffer.
>>
>> No, I do not. The stream buffer should use StoreEntry as storage [when
>> the output goes to Store].
>>
>>> PackableStream is an ostream child providing StoreEntry as the buffer.
>>> So PackableStream data-copies to the StoreEntry buffer.
>>
>> Yes.
>>
>>> Which would you expect provides better performance:
>>> one or two data-copies?
>>
>> One copy.
>>
>> (The design choices here are actually not driven by performance, but by
>> the requirement to avoid buffering of huge cache manager responses in
>> RAM. However, the same custom ostream buffer design happens to eliminate
>> the extra copy as a positive performance side effect.)
>
> Then I am going back to the PackableStream patch and making a new
> iteration that just renames StoreEntryStream and fixes the syntax
> things. Patch for that in the other thread as soon as it works.

Re-introducing PackableStream sounds good to me.

Please note that Actions should not get PackableStream,
StoreEntryStream, or ostream as the new dump() argument though! They
should get an object with payload/page/report formatting methods. See my
Page sketch for an example.


> This patch will then have Formatter creating one of those streams
> if/when necessary to drop values into.

That does not sound quite right:

* an ostream would be needed for virtually any Formatter/Page method
  and, hence, should be created once;

* an ostream should be backed by StoreEntry or another complex
  destination that Formatter/Page should not know about and, hence, it
  should be created outside Formatter/Page.


> Agreed?
To avoid misunderstanding: I hesitate to review a future patch now,
based on a verbal description of changes, especially since "fixes the
syntax things" may mean very different things to me and to you. Even the
specific red flags I noted above may not match your true intent and/or
the next patch iteration!


>>>> Said that, ostream is the wrong primary interface for assembling
>>>> payload/pages/reports. IMO, you should reintroduce ostream capabilities,
>>>> but we should not be [going back to] assembling primary
>>>> payload/pages/reports using ostream. More on that below.

>> Page provides a set of methods for high-level assembly, including:
>>
>>     /// starts recording a list of values
>>     virtual Page &beginList() = 0;
>>     /// finishes recording a list of values
>>     virtual Page &endList() = 0;
>>
>>     /// starts recording a single list item
>>     virtual Page &beginListItem() = 0;
>>     /// finishes recording a single list item
>>     virtual Page &endListItem() = 0;
>>
>>     /// starts recording a key:value pair with a given key
>>     virtual Page &beginKv(const SBuf &key) = 0;
>>     /// finishes recording a key:value pair
>>     virtual Page &endKv() = 0;

> That is not usable by code taking ostream&. Therefore not useful to the
> code which most needs to use this object's API.

Sorry, I do not understand this comment. There should be no
Action::dump2 "code taking ostream" if I interpret you correctly. The
new Action::dump2 methods should accept a Page reference as an argument,
not an ostream reference.


>> class Page {
>> public:
>>     ...
>>     /// use this to assemble opaque/atomic values
>>     ostream &raw();
>>     ...
>> };
>>
>> It might be used along these lines:
>>
>>     page.beginKv("statistics_foo").raw() << something.mean() << endKv;
>>     page.beginKv("worker_id").raw() << "disker" << DiskerId << endKv;

> This relies on chaining

The above illustrates ostream usage for atomic value assembly. Chaining
is a separate issue. The above can be trivially rewritten if chaining is
not supported.
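To make the raw() discussion concrete, here is a minimal sketch of how such a Page could sit on top of a plain ostream. SBuf is replaced with std::string to keep the sketch self-contained, and the KvPage class with its flat "key: value" syntax is invented for illustration; a real formatter would emit the current cache manager syntax or YAML and be backed by a StoreEntry-based stream:

```cpp
#include <cassert>
#include <ostream>
#include <sstream>
#include <string>

// a cut-down Page interface along the lines sketched above
// (SBuf replaced with std::string for illustration)
class Page
{
public:
    virtual ~Page() {}

    /// starts recording a key:value pair with a given key
    virtual Page &beginKv(const std::string &key) = 0;
    /// finishes recording a key:value pair
    virtual Page &endKv() = 0;

    /// use this to assemble opaque/atomic values
    virtual std::ostream &raw() = 0;
};

// A hypothetical formatter producing simple "key: value" lines. A real
// implementation would add syntax-specific prefixes/indentation in
// beginKv() and terminators in endKv(), per its internal state.
class KvPage: public Page
{
public:
    explicit KvPage(std::ostream &os): out(os) {}

    virtual Page &beginKv(const std::string &key) {
        out << key << ": "; // value prefix provided by the formatter
        return *this;
    }

    virtual Page &endKv() {
        out << "\n";
        return *this;
    }

    virtual std::ostream &raw() { return out; }

private:
    std::ostream &out; // backed by Packable/StoreEntry in real code
};
```

Since the methods return Page&, both the chained style (`page.beginKv("worker_id").raw() << ...`) and the manipulator-free style work against the same interface, while the ostream exposed by raw() is used only to assemble the atomic value between beginKv() and endKv().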
We can also remove convenience manipulators if we decide they are bad
for some reason:

    page.beginKv("statistics_foo");
    page.raw() << something.mean();
    page.endKv();
    ...


> * It is reasonable to expect that worker_id has different syntax for
> string and numeric representations in one of the syntaxes added later.

I am happy to assume that for the sake of the argument.


> That would call for multiple kvPair(name, value) methods based on the
> value received here by the << operator.

I think that is more likely to call for multiple value-wrapping methods.
It is not related to the "key" part of "kv". This is not an argument
against chaining or for banning ostream access AFAICT. To address
multiple value types, we will add Page::beginValueFoo() and
valueBar(...) methods and the like.


> * With indentation in the syntax (both current and YAML) the
> "something.mean()" output needs a specific prefix applied at the start
> of each of its lines.

The mean() method returns a double (an average value over some sample).
Sorry if that was not clear from the example. beginKv(key) should
provide any necessary value prefixes, of course (at least until we need
value-type-specific wrappers).


> In both those circumstances it is better to pass 'page' to mean() and
> ensure that sub-pieces of something.mean(page) are individually prefixed
> and syntaxed by page with correct bits according to its internal state.

Sorry, but passing Page to a statistics function returning a mean value
of some sample does not make sense to me. There is no need to expose
statistics code to these formatting issues.

However, let's assume a more complicated example where we are not
dumping a simple double value but something more complex that really
needs to be exposed to Page. How does that more complex code dump a
simple double value keyed to some name? Using the same beginKv()/endKv()
pair with a "<<" operator, as illustrated by the above example.


> Also, the bits between << are now free-flow outputs again.
> That's a sure
> recipe for some report to make an assumption like you do in the example
> code itself (with "disker" assuming a string ID): that a particular kv
> thing is going to take a string rather than an integer value.

The ostream part of the interface is reserved for assembly of atomic
values. You argue that it is _possible_ to misuse it for non-atomic
values. That statement is true for any interface! Regardless of the
interface you provide, it is possible to use it incorrectly. For
example, if your interface only accepts string values, it is possible to
incorrectly pass it a non-atomic "foo: bar" value. In fact, with a
string-only API, it may arguably be harder for reviewers to spot that
mistake because no "<<" operator or .raw() call will attract their
attention.


>> Needless to say, there is some danger in providing such raw access. If a
>> developer is not careful, they might introduce a syntax error inside
>> what they thought were perfectly fine atomic values. We can protect
>> against that and offer automatic escaping of bad characters (among other
>> things), but that would require more work. I suspect such protections
>> would not necessarily require Page interface changes; if I am right, we
>> can delay their implementation until it is clear they are actually needed.

> I don't think that is necessary though. So I am leaving it out of these
> early patches. Okay?

Pronouns "that" and "it" below a 7-line paragraph make it difficult for
me to guess what exactly you are leaving out.

Not adding an extra layer of value protection/wrapping (for now) is the
right approach. Using string-based interfaces for string values is fine.
Avoiding ostream interfaces where they help (or introducing barriers to
their future addition) is not a good idea.


> The current free-form outputs already place indentation requirements
> that the mixed methods + ostream Page API cannot meet.
> It's either ostream with
> Action::dump() doing the indentation explicitly, or just the
> Page/Formatter methods.
>
> YAML adds extra construct syntax requirements that make it even more
> nasty. Then any potential binary outputs we want to generate later will
> die horribly.

Any format has atomic values. Ostream is the right interface for
assembling those atomic values, especially where everything has to
eventually go into a StoreEntry-backed stream or the like. There is no
reason to avoid ostream where it is useful.

Please note that I am not saying that everything has to be formatted
using ostream. It is just an optional part of the interface, used where
it is helpful and ignored where it is not.


>>> What I'm not getting is why you insist this ostream API be exposed and
>>> used by Action now but that PackableStream was inappropriate when it
>>> could simply have had new members added and been identical to your
>>> proposed Formatter.

>> First of all, the first patch I was reviewing did not pass
>> PackableStream to Actions. It passed ostream. We cannot add methods to
>> ostream.

> It is not passed to Actions. It's created by Action and passed to
> Action::dump() and action sub-code.

Yes, "passed to Action::dump()" is what I meant by "passed to Actions".

You asserted that PackableStream "could simply have had new members
added". That assertion did not apply to your patch because
PackableStream was not passed to Action::dump() where those methods were
needed. You passed ostream instead. We cannot add methods to ostream.


>> Second, adding formal-syntax formatting methods to PackableStream (and
>> passing PackableStream to Actions) would be wrong. The PackableStream
>> class should be dedicated to providing a Packable-backed ostream. It
>> should not know anything about Cache Manager reports and their
>> formatting. We may use PackableStream outside Cache Manager.

> Its only uses were as an output serializer (formatter) for Action dump()
> code.
> The slightly higher-order syntax of what the fields were was being
> left in Action::dump() and related functions, where it is in current trunk.
>
> We agreed to use Page/Formatter to allow markup injection between fields
> now, so re-coding was not necessary later.

I answered your "why" question with two specific reasons/explanations.
It feels like you are now arguing about something else. I do not know
what you are arguing about.


> In either model Action::dump is still responsible for knowledge of what
> field types are being output. Either in the Page/Formatter methods it is
> calling, or in the particular stream << sequencing.

I do not know which two models you are talking about, but yes, the new
Action::dump2 would know about payload/page/report structure in
virtually any reasonable implementation.


> As I mentioned above, exposing operator<< to Action just means important
> parts of the markup injection are not possible.

That assertion does not compute for me: How can giving Action an
_additional_ and _optional_ tool make something impossible?!


>> I hope that Page/Formatter does not need to know about Packable, just
>> ostream, but there may be some corner cases I am not aware of.

> If Formatter is allocating the PackableStream, it would. But
> PackableStream is a wholly inline thing for exactly that type of reason.

I do not know why Page/Formatter would need to allocate PackableStream.
Ideally, Page/Formatter should not know exactly where its output is
going. If possible, Page/Formatter should just accept ostream (or, if
really needed, PackableStream) from its caller/creator/configurator.


>>>>>>> +    // a comment
>>>>>>> +    virtual void notice(const SBuf &) = 0;
>>>>>>
>>>>>> Let's try to define the purpose of this method more precisely so that we
>>>>>> know when it is being used [in]correctly.
>> For example, the following are "bad" notices (Squid should not generate
>> them):
>>
>>     uptime: 1 hour
>>     built with libecap v1.1
>>     reconfiguration finished in 5 seconds
>>     error: I/O error while responding to a cache manager request
>>
>> and the following are "good" notices:
>>
>>     reconfiguration finished
>>     unsupported action
>>     unknown cache manager output format

> I think we agree on intention. But neither of us has good text for it yet.
>
> It's that passing-on property we need to clearly state. I've used
> "display" there since it's variable by context and loose enough to cover
> any type of that action.
>
> How about just a terse:
>
> notice() - "An informational text announcement. To be passed on instead
> of ignored. Do not automatically parse the text."

We should not document what should not be ignored. "Not ignoring" is the
assumed default, of course.

"Passed on" to what or to whom? If I am an admin looking at the notice
on my terminal, I do not need to pass it on further.

How about this:

    notice(): Text containing potentially useful information for admins.
    Expected to be treated as a single opaque string by automation tools.

BTW, if we want to make admins' and support folks' lives easier, I would
actually add a notice ID as a notice parameter and require that all
notices with the same ID mean the same thing, regardless of the text
(which we may change). Without an explicit ID, the text itself
essentially becomes an ID, which makes it difficult to adjust.


> comment() - "An informational text comment. May be ignored or dropped.
> Do not automatically parse the text."

This one is much easier, IMO (unless we disagree about what this method
should be used for):

    comment(): A comment as defined by the format syntax. Usually
    contains information meant for developers. Expected to be discarded
    by automation tools.


>>>> Going forward, please do not use "final" unless really necessary. Treat
>>>> it like we treat "throw()" declarations.
>>> You mean spotted liberally around the new code in the form of a wrapper
>>> keyword like Must()?
>>
>> No, I mean avoided like the now-deprecated "throw()" exception
>> specification:
>>
>>     http://en.cppreference.com/w/cpp/language/except_spec
>>     http://www.gotw.ca/publications/mill22.htm
>>
>> Proposed rule of thumb: If correct inheritance or overriding is
>> possible, do not use "final", even if you do not see any good reasons to
>> inherit or override.

> Aha. Though AFAIK correct inheritance is *always* possible in reasonable
> code. So that is just a ban on "final".

It depends on how you define "correct". Some say[1] that there is no
_correct_ way to subclass std::vector, for example. Since this is just a
rule of thumb, I am not sure we need to make it more precise.

[1] http://stackoverflow.com/questions/6806173/subclass-inherit-standard-containers

In practice, I think final is used for low-level stuff that plays dirty
tricks with memory or devices. FWIW, the STL implementation in GCC v4.8
has about a dozen "final" classes and structures. All very low-level
memory allocation and shared pointer stuff AFAICT -- I do not claim to
understand exactly why those complex classes are final!

Here is one easy-to-understand example of where final would be useful:

    https://akrzemi1.wordpress.com/2012/09/30/why-make-your-classes-final/


> As you saw, my uses of it are kind of "political" in nature. To enforce
> better code use and graceful deprecation of widely used things.

I do not think final should be used for graceful deprecation. Final
should not imply "dying" or "soon to go away". And even if something is
planned to be removed, it may still make sense to subclass or override
it (all other alternatives can be worse than extending deprecated code).


Cheers,

Alex.

_______________________________________________
squid-dev mailing list
squid-dev@lists.squid-cache.org
http://lists.squid-cache.org/listinfo/squid-dev