Couple of questions here:

1) Do we know why ET_CLUSTER is so much more CPU intensive than ET_NET? Shouldn't they be (roughly) the same?
2) Is the 300Mbps limit on the cluster connection due to the CPU consumption? Does it vary with the type of traffic it proxies? Do larger objects fare better (easier, faster, more throughput)?

3) Can you explain more about what the "message driven" architecture entails?

Also, I think it's worthwhile to add the cluster redesign to its own Project page now, since you seem to have both concrete design and implementation details.

Thanks!

-- Leif

On Sep 25, 2013, at 11:21 PM, Zhao Yongming (Confluence) <conflue...@apache.org> wrote:

> 4.x Project plan
> Page edited by Zhao Yongming
> Comment: update cluster plan
>
> Changes (3)
> ...
> h3. Cache clustering improvements
>
> <Alibaba provide more details>
> See [https://cwiki.apache.org/confluence/display/TS/Clustering] for more cluster
> details. We are now working on the 'refine cluster' project, whose main
> target is performance improvement:
>
> [refine cluster TS-2005|https://issues.apache.org/jira/browse/TS-1571]
> * what problem do we want to solve?
> ** CPU usage
> on our 16-core systems, we find that the ET_CLUSTER threads use
> more CPU than ET_NET; we have to add more cluster threads, up to 8-12,
> to avoid hitting the CPU limit of ET_CLUSTER.
> ** throughput per cluster connection
> each cluster connection can provide 300Mbps of traffic, which is hard to
> live with given the high traffic requirement
> ** big cluster issue
> we need to handle about 200 hosts per cluster
> * what we do:
> ** make cluster a pure message driven layer, no more vc splice on each side
> ** clean up the msg encapsulation and callback implementations
> ** modify the cache cluster interface
> * what we don't change:
> ** most of the high-level cluster API
> ** most of the high-level cache interface
>
> h3. Cache key redesign
>
> ...
> Full Content
> These are the current brainstorming ideas for a v4.x plan. This will change!
>
> Accepted plans
> Ideas
> HostDB
> Fundamental rewrite of the whole damn thing.
>
> Remove persistent storage
> Make it possible to write appropriate Load Balancer plugins
> Plugin interfaces
> Implements IP address generators
> E.g. parallel ipv4/ipv6 lookups
> Layer split DNS
> Better retry logic
> Can afford allocations on recursive resolving (going to DNS), but not serving
> out of internal cache.
> /etc/hosts
> All generators are forward iterators
> The generator may require a continuation if iterators would block
> Cache in the core, on top of generators
> Health-checks: Do they belong in the core / APIs here, or entirely in
> plugins? It's probably useful to have at least basic functionality available
> in core, but extensible through APIs.
> Configurations
> Requirements
> Comments
> Data types
> Scalars
> numerical
> integer
> floats
> strings
> hashes / maps
> lists
> enum
> Nesting
> Includes
> amc's ideas
> Cleanup code - cleanup configs
> TSconfig or YAML as basic config format (universally, for all configs)
> Extension to allow e.g. plugins / Lua to modify the underlying config storage
> E.g. Pass the current config unit to a Lua script, which can modify it (at
> startup) and return it to the core
> bcall: need to support include directories ("include ats.d/*")
> One format, allow to slice it up as you like
> Big issue: What config format should we use?
> YAML
> TSconfig
> JSON [No comments]
> ming_zym's proposal: example at http://people.apache.org/~zym/remap_config.txt
> Cache visibility
> Modification notifications via hooks / stats
> E.g. know when an object is about to be evicted
> Include header information (if requested)
> Plugin APIs
> HTTP cache APIs
> Need to update cache ops TSApi to work with the actual HTTP cache.
>
> Cache plugin APIs
> Reintroduce this, but with reasonable performance.
>
> Cache clustering improvements
> See https://cwiki.apache.org/confluence/display/TS/Clustering for more cluster
> details. We are now working on the 'refine cluster' project, whose main
> target is performance improvement:
>
> refine cluster TS-2005
>
> what problem do we want to solve?
> CPU usage
> on our 16-core systems, we find that the ET_CLUSTER threads use
> more CPU than ET_NET; we have to add more cluster threads, up to 8-12,
> to avoid hitting the CPU limit of ET_CLUSTER.
> throughput per cluster connection
> each cluster connection can provide 300Mbps of traffic, which is hard to
> live with given the high traffic requirement
> big cluster issue
> we need to handle about 200 hosts per cluster
> what we do:
> make cluster a pure message driven layer, no more vc splice on each side
> clean up the msg encapsulation and callback implementations
> modify the cache cluster interface
> what we don't change:
> most of the high-level cluster API
> most of the high-level cache interface
> Cache key redesign
> See: https://cwiki.apache.org/confluence/display/TS/Cleanup+of+Cache+URLs
>
> API for generating e.g. X-Cache-Detail (or Via) header
> See https://issues.apache.org/jira/browse/TS-1571 .
>
> It should be possible to generate any type of these headers, and ideally,
> replace our hardcoded Via: headers with a plugin producing the same.
>
> Cache partial objects
> Basically, allow caching of URLs that only get Range: requests. This is
> discussed on https://issues.apache.org/jira/browse/TS-974 .
>
> Graceful reload of plugins, or graceful restarts
> This is discussed on https://issues.apache.org/jira/browse/TS-1969 . Being
> able to transfer cache (mmap'ed areas) from old process to new would be neat.
>
> Server Intercepts for e.g. FastCGI / AJP
> This would allow us to e.g. run PHP inside ATS, or serve static files out of the
> file system (easily). There are a few issues here, such as the performance of
> server intercepts not being great.
> This would be one or more new plugins, and
> possibly API additions and improvements (to get performance to behave at
> least).
>
> Tiered Storage
> This is partially implemented with the Alibaba cache
> (https://issues.apache.org/jira/browse/TS-745), and with the Comcast volume
> tagging (https://issues.apache.org/jira/browse/TS-1728). A new ticket should
> be created for this task, and we also want to unify configuration here (right
> now, TS-745 configures via records.config, and not storage.config).
>
> It's also a missing feature that we cannot, for example, run with RAM
> cache only, since the RAM cache is not an independent "cache tier". Such
> decoupling would be necessary to work with cache plugin APIs as well (right
> now, the two RAM caches are tightly integrated with the disk cache).
>
> SSL session tickets
> ATS needs a shared secret to validate the client submitted session tickets.
>
> additional reading:
> http://tools.ietf.org/html/rfc5077
> https://cwiki.apache.org/confluence/display/TS/SSLSessionResumption
> can not reuse SSL connections on RHEL5/CentOS5
> RFC 5077 TLS Session tickets
>
> RFC 5861
> Stale-while-revalidate and/or stale-if-error. This requires core changes to
> the cache / dirs to support these features properly. Discussion started up on
> whether this belongs in the core or in a plugin.
>
> Documentation
> Migration to Sphinx documentation, to contain all documentation, and produce
> all formats (online/HTML, PDF, man-pages etc.). Alan is working on some
> extensions to make it easy to link to source files etc.
>
> Goals for v3.6:
>
> Complete conversion to Sphinx, including automatic website generation etc.
> Add missing documentation (plugins, configurations, etc.)
> Verify completeness with old PDF documentation.
> Internationalization support.
> Transifex.com
> Support documentation (man-pages / HTML) generation as part of make asf-dist.
> SPDY and HTTP/2 support
> At a minimum, as a front-end (client side) plugin, but we should examine how
> to be a full SPDY and/or HTTP/2 proxy. How do WebSockets fit into this?
> Should we provide APIs to make it easier / possible for a plugin to do the
> upgrade dance, as well as to build application specific WebSocket proxying?
>
> Modularization of the HTTP SM
> Needs
>
> Documentation of current flow and states
> Cleanup
> Refactor / modularize as necessary.
> C++ API
> Finalize the APIs, migrate plugins as appropriate, provide examples and
> documentation.
>
> Authentication
> We have the auth-proxy plugin. Bryan Call to investigate what Yahoo has
> in the space of OAUTH(2) plugins.
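
On the RFC 5861 item in the plan: to make the core-vs-plugin discussion a bit more concrete, here is a rough sketch of the two decisions the cache would have to make for the stale-while-revalidate and stale-if-error Cache-Control extensions. This is only an illustration under my own assumptions. The class, field and function names are hypothetical and are not ATS APIs, and the boundary handling (treating "up to N seconds" as inclusive) is my reading, not something the plan specifies.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CachedResponse:
    """Hypothetical cache entry; these fields are not ATS data structures."""
    age: int                                      # current age, in seconds
    max_age: int                                  # freshness lifetime, in seconds
    stale_while_revalidate: Optional[int] = None  # RFC 5861 section 3 value
    stale_if_error: Optional[int] = None          # RFC 5861 section 4 value

    @property
    def stale_for(self) -> int:
        """Seconds since the response became stale (0 while still fresh)."""
        return max(0, self.age - self.max_age)


def on_request(resp: CachedResponse) -> str:
    """Decide what to do when a request hits a cached response."""
    if resp.age < resp.max_age:
        return "serve_fresh"
    swr = resp.stale_while_revalidate
    if swr is not None and resp.stale_for <= swr:
        # Inside the window: answer from cache, refresh asynchronously.
        return "serve_stale_and_revalidate_in_background"
    return "revalidate_before_serving"


def on_revalidation_error(resp: CachedResponse) -> str:
    """Decide what to do when revalidation against the origin failed."""
    sie = resp.stale_if_error
    if sie is not None and resp.stale_for <= sie:
        return "serve_stale"
    return "return_error"


if __name__ == "__main__":
    # Cache-Control: max-age=600, stale-while-revalidate=30, stale-if-error=1200
    entry = CachedResponse(age=610, max_age=600,
                           stale_while_revalidate=30, stale_if_error=1200)
    print(on_request(entry))
    print(on_revalidation_error(entry))
```

The background revalidation in the first function is the part that looks hard to do purely from a plugin, which is probably why the core changes come up.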