Re: [PATCH] Native FTP Relay
On 08/08/2014 11:13 AM, Amos Jeffries wrote:
> On 9/08/2014 4:57 a.m., Alex Rousskov wrote:
>> On 08/08/2014 09:48 AM, Amos Jeffries wrote:
>>> Audit results (part 1):
>>> * Please apply CharacterSet updates separately.
>>> * Please apply Tokenizer API updates separately.
>>> * please apply src/HttpHdrCc.h changes separately.

>> Why? If the branch is merged, they will be applied with their
>> own/existing commit messages.

> They are updates on previous feature changesets unrelated to FTP which
> will block future parser alterations (also unrelated to FTP) being
> backported if this project takes long to stabilize.
>
> If they have their own commit messages then cherrypicking out of the
> branch should be relatively easy.

Thank you for explaining your rationale. To minimize overheads, let's
come back to this issue if the branch is not merged soon.

>>> * is FTP/1.1 really the protocol version of FTP parsed by Squid? (the
>>> AnyP::ProtocolVersion parameters)

> Okay, leave as-is but please document that as the reason for using that
> version number

Done. It was actually documented in the corresponding commit message. To
provide _source code_ documentation in one place, I added
Ftp::ProtocolVersion() and used that everywhere instead of hard-coded
"1,1". It will be easier to track down all "1,1" users that way if
somebody wants to change or polish this further later. See ftp/Elements.*

The resulting code required more changes than one may expect and is
still a little awkward because Http::ProtocolVersion should have been
just a function (not a class/type), and because PortCfg::setTransport
internals did not belong to AnyP where they were placed. I did my best
to avoid the cascading changes required to fix all that by leaving
Http::ProtocolVersion as is and moving PortCfg::setTransport()
internals into cache_cf.cc where they can access FTP-specific code.

> and a TODO about cleaning up the message processing to
> stop assuming HTTP versions.

Done in one of the many client_side.cc places where we still assume
HTTP. Here is the new TODO:

> /* RFC 2616 section 10.5.6 : handle unsupported HTTP major versions
>    cleanly. */
> /* We currently only support 0.9, 1.0, 1.1 properly */
> /* TODO: move HTTP-specific processing into servers/HttpServer and such */
> if ( (http_ver.major == 0 && http_ver.minor != 9) ||
>      (http_ver.major > 1) ) {

This part of the problem will be gone when HTTP-specific code is moved
to HttpServer, but the ICAP compatibility part will not go away
regardless of any cleanup.

>>> in src/cf.data.pre:
>>> * please document tproxy, name=, protocol=FTP as supported on ftp_port

>> We do not know whether it is supported and do not plan on testing/fixing
>> that support right now. Do you still want us to document it as supported?

> Yes. You have not changed the TcpAcceptor code for TPROXY lookups or the
> ACL code using port name. So they are supported.

Done. Please note that protocol= requires accel mode, and I am not sure
what that mode means for a native FTP relay. I added protocol=FTP anyway
in the hope of shortening the discussion.

> protocol= you explicitly added support in the parser.

This is probably a misunderstanding caused by poor official code
quality. We did not add explicit support for the protocol= option. The
parsing code we added was required for the _default_ ftp_port
configuration (no protocol=... option) to work because of the way the
general port parsing (and other) official code is written.

The official code uses CfgPort::transport for at least two rather
different purposes (for determining foo in foo_port and for the
protocol= feature). Cleaning this up requires a serious effort well
beyond the FTP project scope (the FTP code does not really need the
protocol= feature to work).

If, after reading the above clarification, you would rather see
protocol=FTP gone from squid.conf.documented, I would be more than
happy to remove it.

> in src/FwdState.cc:
> * at line 817 calling request->hier.note() twice in a row with different
> values is both needless change and wasting CPU.

Fixed. This was my merge bug. The fixed patch uses the more recent line
version (and both lines were yours, thank you).

> in src/HttpHeader.cc:
> * please enact the new TODO about stripping FTP-Arguments "header" on
> PASS command.

Done to shorten the discussion.

> This information leak could earn us a CVE if merged as-is.

I do not think it could because that code is currently unused by/for FTP
relays (as the added source code comment tried to explain). Native FTP
relay does not send Squid "error pages" to the user and those error
pages are the only context (that I know of) where the masking code is
actually used today.

> in src/HttpHeaderTools.cc:
> * httpHeaderQuoteString() could be optimized with needInnerQuote as
> SBuf::size_type to append in blocks between characters needing quote.

Agreed. Added a corresponding TODO. I do not want to volunteer to provi
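To make the suggested optimization concrete, here is a minimal sketch,
assuming only '"' and '\' need inner quoting; std::string stands in for
Squid's SBuf, and quoteHeaderValue() is a hypothetical stand-in for
httpHeaderQuoteString(), not the actual implementation:

    #include <string>

    // Sketch: append in blocks between characters that need quoting,
    // instead of examining and appending one character at a time.
    static std::string quoteHeaderValue(const std::string &raw)
    {
        std::string quoted;
        quoted.reserve(raw.size() + 2);
        quoted += '"';
        std::string::size_type start = 0;
        while (start < raw.size()) {
            // position of the next character that needs an inner quote
            const std::string::size_type needInnerQuote =
                raw.find_first_of("\"\\", start);
            if (needInnerQuote == std::string::npos) {
                quoted.append(raw, start, std::string::npos); // safe tail
                break;
            }
            quoted.append(raw, start, needInnerQuote - start); // safe block
            quoted += '\\';
            quoted += raw[needInnerQuote]; // the escaped special character
            start = needInnerQuote + 1;
        }
        quoted += '"';
        return quoted;
    }

The find_first_of() scan lets whole runs of safe characters be appended
in one call, which is the point of tracking needInnerQuote as a position
(SBuf::size_type) rather than a boolean.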
Re: [PATCH] OAuth 2.0 Bearer authentication
On 08/09/2014 04:19 AM, Amos Jeffries wrote:
> On 5/08/2014 3:22 a.m., Alex Rousskov wrote:
>> On 07/31/2014 03:29 AM, Amos Jeffries wrote:
>>> A garbage collection TTL "cleanup_interval" is configurable and removes
>>> cache entries which have been stale for at least 1 hr.

>> While some old code still uses periodic cleanup, I think we should avoid
>> adding more code like that. Periodic cleanup leads to cache overflows,
>> memory usage surprises, and hot idle. Please remove old unneeded cache
>> entries when adding a new cache entry instead of checking the cache
>> state every hour.

> I agree. However this is a common algorithm for all of Squid
> authentication types. Updating it should be done as a separate action
> and cover more than just this auth scheme. In particular the core cache
> code is shared by Basic and Digest.

It may be acceptable to reuse the old caching code that uses periodic
cleanup. However, adding brand new caching classes with periodic cleanup
is a rather different matter. An "it is a common wrong approach" excuse
cannot be successfully applied to a completely new cache class IMO.

In the proposed patch, the caching with periodic cleanup is done by
TokenCache, which is a new type with various management code surrounding
it (essentially a new class for the purpose of this discussion).

>>> +// only clear tokens out of cache after 1 hour stale
>>> +// could make this configurable
>>> +time_t stalenessLimit = current_time.tv_sec - 60*60;

>> Cache size should be(**) regulated in terms of memory usage (bytes), not
>> entry age (seconds). I believe estimating memory footprint of a cache
>> entry would be fairly straightforward in this case. Would you be willing
>> to convert the code to use size limits?

> I agree with the idea. If/when we can move to LruMap or equivalent that
> will happen. For now this is consistent with the other auth schemes, and
> sufficient as a temporary state.

Why cannot this code use the existing LruMap class from the start? Yes,
LruMap requires some adjustments to handle more class users, but what is
worse: making those adjustments or committing code that introduces a new
wrong class?

The consistency reason hardly applies to a new cache class. Yes, some
existing caches use time-based capacity, but some do not. Why pick the
wrong model for the new stuff?

> The memory calculation is also not as simple as it seems.

I did not say it is simple. I said it is "fairly straightforward", which
is at least a notch above simple :-). I think it reflects the complexity
of those changes well.

> The SBuf content consists of the majority of memory usage and is
> dynamic at run-time. What is actually needed to account for the memory
> size of a TokenCache node is:
>
>   sizeof(TokenPointer)
>   +
>   sizeof(Token)
>   +
>   (*X)->b64encoded.store_->capacity

Sounds straightforward to me.

> Note that:
> 1) store_ is correctly private and SBuf::length() is not sufficient to
> account for most of the allocated MemBlob capacity being consumed.

I do not understand why exposing the underlying storage capacity
information for this purpose is a big deal.

> 2) the store_->capacity may be shared by Token, so freeing just this
> entry may not reduce memory usage by the amount it "uses". Accounting
> for it may have caused the cache size to be over-estimated.

Sure, but we do not need a precise number. We only need a good estimate.

And if experience shows that using shared MemBlob capacity is not good
enough, we can always "cow" the tokens stored to make sure they own
their buffer exclusively.

Please note that the cache is responsible only for what it uses. It is
not responsible for the memory used by others, even if it is the same
memory. If a MemBlob is shared by a cache entry and something else in
Squid, removing that cache entry is sufficient as far as cache
obligations and accounting are concerned.

> 3) the whole calculation is hard-coded at compile time in LruMap. So we
> cannot call the SBuf size methods anyway.

I do not understand why making the LRU entry size variable for this
purpose is a big deal.

> I notice that the current use of LruMap for storing
> CertValidationResponse objects is also suffering from this problem. It
> fails to account for the data size used by a whole vector of error
> strings, and all of a X509 certificate memory allocation (we know those
> are huge with lots of linked details).

That bug is one of the motivations to do the right thing here. I do not
see why we should repeat a similar mistake twice. Let's learn from our
mistakes.

And making LruMap more flexible would actually help with fixing the
certificate storage bug as well (although the problem with that bug is
far more difficult than adding variable entry size to LruMap).

>>> +/*
>>> + * For big caches we could consider stepping through
>>> + * 100/200 entries at a time. Lets see how it flies
>>> + * first.
>>> + */
>>> +Auth::Bearer::TokenC
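To make the size-based alternative concrete, here is a minimal sketch
under the assumptions discussed above; Token, TokenPointer,
memoryUsedBy(), and the std::list-based LRU are illustrative stand-ins
(std::shared_ptr and std::string in place of Squid's RefCount and
SBuf/MemBlob machinery), not the actual LruMap API:

    #include <cstddef>
    #include <list>
    #include <memory>
    #include <string>

    struct Token {
        std::string b64encoded; // stands in for the SBuf-backed field
    };
    using TokenPointer = std::shared_ptr<Token>;

    // Estimate what one cache entry "uses", per the formula in the
    // thread: sizeof(pointer) + sizeof(object) + allocated capacity.
    static size_t memoryUsedBy(const TokenPointer &t)
    {
        return sizeof(TokenPointer) + sizeof(Token) +
               t->b64encoded.capacity();
    }

    // Evict least-recently-used entries until the estimated total
    // memory fits the configured byte limit; no periodic timer needed.
    static void trimToLimit(std::list<TokenPointer> &lru,
                            size_t &totalBytes, const size_t limitBytes)
    {
        while (totalBytes > limitBytes && !lru.empty()) {
            totalBytes -= memoryUsedBy(lru.back()); // back() = least recent
            lru.pop_back();
        }
    }

    // Insertion does the cleanup work, amortized: account for the new
    // entry, then trim, instead of scanning the whole cache every hour.
    static void insertToken(std::list<TokenPointer> &lru,
                            size_t &totalBytes, const size_t limitBytes,
                            const TokenPointer &t)
    {
        lru.push_front(t); // front = most recently used
        totalBytes += memoryUsedBy(t);
        trimToLimit(lru, totalBytes, limitBytes);
    }

Because trimming happens on insertion, the byte limit is enforced
continuously (no cache overflows between cleanups, no hot idle), and the
per-entry capacity() term is exactly the variable entry size that a more
flexible LruMap would have to support.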
Re: [PATCH] OAuth 2.0 Bearer authentication
On 5/08/2014 3:22 a.m., Alex Rousskov wrote:
> On 07/31/2014 03:29 AM, Amos Jeffries wrote:
>
>> A garbage collection TTL "cleanup_interval" is configurable and removes
>> cache entries which have been stale for at least 1 hr.
>
> While some old code still uses periodic cleanup, I think we should avoid
> adding more code like that. Periodic cleanup leads to cache overflows,
> memory usage surprises, and hot idle. Please remove old unneeded cache
> entries when adding a new cache entry instead of checking the cache
> state every hour.

I agree. However this is a common algorithm for all of Squid
authentication types. Updating it should be done as a separate action
and cover more than just this auth scheme. In particular the core cache
code is shared by Basic and Digest.

>> +// only clear tokens out of cache after 1 hour stale
>> +// could make this configurable
>> +time_t stalenessLimit = current_time.tv_sec - 60*60;
>
> Cache size should be(**) regulated in terms of memory usage (bytes), not
> entry age (seconds). I believe estimating memory footprint of a cache
> entry would be fairly straightforward in this case. Would you be willing
> to convert the code to use size limits?

I agree with the idea. If/when we can move to LruMap or equivalent that
will happen. For now this is consistent with the other auth schemes, and
sufficient as a temporary state.

The memory calculation is also not as simple as it seems. The SBuf
content consists of the majority of memory usage and is dynamic at
run-time. What is actually needed to account for the memory size of a
TokenCache node is:

  sizeof(TokenPointer)
  +
  sizeof(Token)
  +
  (*X)->b64encoded.store_->capacity

Note that:

1) store_ is correctly private and SBuf::length() is not sufficient to
account for most of the allocated MemBlob capacity being consumed. The
Token may be holding a 64KB I/O buffer MemBlob for access to 10 bytes of
token string.

2) the store_->capacity may be shared by Token, so freeing just this
entry may not reduce memory usage by the amount it "uses". Accounting
for it may have caused the cache size to be over-estimated.

3) the whole calculation is hard-coded at compile time in LruMap. So we
cannot call the SBuf size methods anyway.

I notice that the current use of LruMap for storing
CertValidationResponse objects is also suffering from this problem. It
fails to account for the data size used by a whole vector of error
strings, and all of a X509 certificate memory allocation (we know those
are huge with lots of linked details). Instead it assumes the cert is
sizeof(X509*) and the string content is sizeof(std::vector) which is a
KB or more away from reality per-entry.

>> +/*
>> + * For big caches we could consider stepping through
>> + * 100/200 entries at a time. Lets see how it flies
>> + * first.
>> + */
>> +Auth::Bearer::TokenCache *cache = &Auth::Bearer::Token::Cache;
>> +for (Auth::Bearer::TokenCache::iterator itr = cache->begin(); itr !=
>> cache->end();) {
>
> I do not think we should do linear search like that. It is going to be
> too slow for large caches and adding arbitrary "maximum number of
> iterations" limits just adds more tuning headaches. Why not use an LRU
> cache instead? IIRC, we have even added LruMap class for that recently.

No less efficient than the linear walk the LruMap does. A full scan is
required for the time-based removal since entries are mapped by their
key value, not by expiry time. LruMap also uses raw pointers for storing
its entries.

These Token objects are ref-counted and if we store as raw pointers they
will either leak or be freed while still linked by the cache. The only
way around that seems to be storing Pointers in the list and making
every access a double de-reference. Or re-implementing the entire LruMap
template using Pointer and a dynamic size() calculation.

The existing cache model used by both Basic and Digest is already
sufficient for the needs of this feature. Note that the periodic cleanup
is an optimization for large caches, to reduce the impact of this linear
search for stale objects instead of

>> +For Basic and Bearer the default is "Squid proxy-caching web server".
>
> Please add a comma after Bearer.

Done.

Amos
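For completeness, a minimal sketch of the ref-counting concern raised
above, with std::shared_ptr standing in for Squid's ref-counted
Token::Pointer (names are illustrative, not Squid code):

    #include <cassert>
    #include <list>
    #include <memory>
    #include <string>

    struct Token {
        std::string b64encoded;
    };
    using TokenPointer = std::shared_ptr<Token>; // stand-in for Token::Pointer

    int main()
    {
        std::list<TokenPointer> lru;                 // the cache's LRU list
        TokenPointer held = std::make_shared<Token>(); // another live reference
        held->b64encoded = "opaque-token";

        lru.push_front(held); // cache takes a reference, not a raw pointer
        lru.pop_front();      // eviction releases only the cache's reference

        // The token survives eviction because the other holder keeps it
        // alive; with raw pointers, eviction would have to choose between
        // leaking the object and freeing memory that other code may still
        // dereference.
        assert(held->b64encoded == "opaque-token");
        return 0;
    }

The cost of this safety is the double de-reference mentioned above:
reaching the token data goes through the list node and then through the
smart pointer.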