On 11/25/2009 08:34 PM, Bryan Call wrote:
I am talking about internal files created by traffic server, cache and hostdb
being a couple of them. For someone with a large setup they would roll out
software to a few servers at a time. Most caches have a Zipfian distribution
and would have a working set that would populate the cache to a stable cache
hit ratio in a few hours; this has been my experience with very large databases.
I think I disagree with the cache not being "important" to preserve;
most people probably won't run deployments as massive as Y!'s. But this
is obviously speculation only; the only "metrics" we have on use cases
come from internal use.
Yes, you just proved my point. There are few people (not many) that have large
caches. You can add Search Crawler and Flickr (that doesn't use TS) to the
list. This is a minority of the users and a minority of the traffic. Few
groups are asking for larger caches.
Well, my point isn't that > 512GB caches are the normal case; it was that
there are known use cases internally at Yahoo. I can't speculate on how
people outside Y! would use TS; I don't think anyone knows.
The list that you have above are reasons groups will be less likely to
move to the Apache branch. People will have to modify their plugins
and wipe their caches. According to you this is "very disruptive".
Yes, it'll be disruptive, once, and for Y! engineering only. Something
we're willing to deal with (once) as part of this OpenSource effort. I
think it'd be much, much more disruptive to break compatibility once
we've made an official ASF release. I'm assuming (and hoping) that we'll
have hundreds if not thousands of customers relying on Traffic Server
once we make an official release. Affecting them in a "disruptive" way
when upgrading (2.2, 2.4 etc.) seems much more harmful than letting Y!
take the one-time hit.
You haven't addressed anything about stability and how we are going to test all
the changes. There have been a lot of changes to the Apache tree that haven't
been fully tested. Also, there have been a lot of changes that have happened
and are happening to the internal branches that haven't made it to the Apache
branch yet.
My assumption has been that there are enough serious issues with those
items that I've pointed out, that we need to fix those. If that is not
the case, then sure, let's freeze the APIs / ABIs / cache layouts now,
and focus on stability. But then we shouldn't break this until a "3.0"
release, IMO at least.
This assumption of mine is based on two things that I have experience
with: 1) The cache plugin APIs have major performance problems, and most
likely need a major overhaul. During their incremental development in
1.17 (Y! internal), they broke ABIs often, causing major problems with
deployments (it's been a mess). 2) The Remap
APIs are majorly horky IMO. At a minimum, we need to fix the things that
are missing and broken (I'm fairly certain that the chaining of remap
plugins is completely wrong for example).
John's comments and bug reports regarding the disk cache make me
believe that it's a worthwhile change to make now, and doing so will
avoid breaking it after the initial release.
All of these are up for discussion. I obviously shouldn't have gone out
and called them "personal requirements"; that was bad wording on my part.
I should have said "personal wish list".
This reminds me of when we tried to (or are we still trying to) stabilize the
1.17 branch and there was a lot of push to add in features (SRV,
string_get_ref, redirect, etc). These features created instability and were
not properly tested. I don't think anyone wants to go down that road again...
Yes, that is a valid point. I think the main problem with the 1.17
release was that we kept adding more and more features to it, and never
let it stabilize between such additions. So, let's decide exactly
what should and should not go into the first ASF release (which I
believe we'll call "2.0"). If any of the things on my personal
preference list don't get addressed, so be it. I still believe that
stabilizing APIs / ABIs now would make it a more stable platform for
people to build on, but it's entirely possible I'm completely wrong.
Let's start a Confluence Wiki page with the proposals for what should be
candidates for going into a "2.0" release, and then vote on each one of
them. I think that's the HTTPD way, right Paul? I've created this page,
please help out and update it with details and other ideas:
http://cwiki.apache.org/confluence/display/TS/Release-2.0
I'm not adding sections on obvious things here, like proper startup
scripts, traffic_manager actually working etc. Things that are outright
broken, which used to work, we clearly have to fix to call it "stable" :).
Cheers,
-- leif