The trick, at least in squid-2, is to make sure that quick abort isn't
occurring. Otherwise it will begin downloading the whole object, return the
requested range bit, and then abort the remainder of the fetch.
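For reference, the quick-abort behaviour Adrian mentions is controlled by squid.conf directives; the directive name below is a real Squid directive, but the comment and choice of value are only an illustrative sketch of one way to avoid the mid-fetch abort:

```
# Keep fetching client-aborted cacheable objects to completion rather than
# quick-aborting them. Per the squid.conf documentation, setting
# quick_abort_min to -1 KB disables the quick-abort check entirely.
# (quick_abort_max and quick_abort_pct tune the same heuristic.)
quick_abort_min -1 KB
```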
Adrian

2009/11/25 Amos Jeffries <squ...@treenet.co.nz>:
> Matthew Morgan wrote:
>>
>> On Wed, Nov 25, 2009 at 7:09 PM, Amos Jeffries <squ...@treenet.co.nz>
>> wrote:
>>>
>>> Matthew Morgan wrote:
>>>>
>>>> Sorry it's taking me so long to get this done, but I do have a
>>>> question.
>>>>
>>>> You suggested making getRangeOffsetLimit a member of HttpReply. There
>>>> are two places where this method currently needs to be called: one is
>>>> CheckQuickAbort2() in store_client.cc. This one will be easy, as I can
>>>> just do entry->getReply()->getRangeOffsetLimit().
>>>>
>>>> The other is HttpStateData::decideIfWeDoRanges in http.cc. Here, all
>>>> we have access to is an HttpRequest object. I looked through the
>>>> source to see if I could find where a request owned or had access to a
>>>> reply, but I don't see anything like that. If getRangeOffsetLimit were
>>>> a member of HttpReply, what do you suggest doing here? I could make a
>>>> static version of the method, but that wouldn't allow caching the
>>>> result.
>>>
>>> Ah. I see. Quite right.
>>>
>>> After a bit more thought I find my original request a bit weird.
>>>
>>> Yes, it should be a _Request_ member and do its caching there. You can
>>> go ahead with that now while we discuss whether to do a slight tweak on
>>> top of the basic feature.
>>>
>>> [cc'ing squid-dev so others can provide input]
>>>
>>> I'm not certain of the behavior we want here if we do open the ACLs to
>>> reply details. Some discussion is in order.
>>>
>>> The simple way would be to not cache the lookup the first time, when
>>> reply details are not provided.
>>>
>>> It would mean the method potentially returns two different values
>>> across the transaction:
>>>
>>> 1) based on request details only, to decide if a range request is
>>> possible; and then
>>> 2) based on the additional reply details, to see if the abort could be
>>> done.
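The "make it a _Request_ member and do its caching there" arrangement could be sketched as below. This `HttpRequest` is a hypothetical stand-in, not Squid's real class; the sentinel value and the `computeLimit()` body are illustrative assumptions only:

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for Squid's HttpRequest (the real class lives in
// the Squid source tree); it shows only the compute-once-then-cache idea.
class HttpRequest {
public:
    // Returns the effective range offset limit for this transaction.
    // The first call computes it (in Squid this would evaluate the
    // configured ACL list); later calls return the cached value.
    int64_t getRangeOffsetLimit() {
        if (rangeOffsetLimit_ == kUncached)
            rangeOffsetLimit_ = computeLimit();
        return rangeOffsetLimit_;
    }

private:
    // Squid uses -1 to mean "no limit", so the "not yet computed" sentinel
    // must be a distinct value; -2 is an arbitrary choice for this sketch.
    static const int64_t kUncached = -2;
    int64_t rangeOffsetLimit_ = kUncached;

    int64_t computeLimit() const {
        // Placeholder: 0 means "never fetch beyond the requested range".
        // The real code would consult the request's details and the config.
        return 0;
    }
};
```

A caller such as decideIfWeDoRanges could then invoke `request->getRangeOffsetLimit()` repeatedly without re-running the ACL evaluation.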
>>>
>>> No problem if the reply details cause an increase in the limit. But if
>>> they restrict it, we enter the territory of potentially making a
>>> request, then canceling it and being unable to store the results.
>>>
>>> Or, taking the maximum of the two across the two calls, so it can only
>>> increase? That would be slightly trickier, involving a flag as well to
>>> short-circuit the reply lookups, instead of just a magic cache value.
>>>
>>> Am I seriously over-thinking things today?
>>>
>>> Amos
>>
>> Here's a question, too: is this feature going to benefit anyone? I
>> realized later that it will not solve my problem, because all the
>> traffic that was getting force-downloaded ended up being from Windows
>> updates. The URLs showing up in netstat and such were just weird
>> because the Windows update traffic was actually coming from Limelight.
>> My ultimate solution was to write a script that reads access.log,
>> checks for Windows update URLs that are not cached, and downloads them
>> one at a time after hours.
>>
>> If there is anyone at all who would benefit from this, I would still be
>> *more* than glad to code it (as I said, it would be my first real open
>> source contribution... very exciting), but I just wondered if anyone
>> will actually use it.
>
> I believe people will find more control here useful.
>
> Windows update service packs are a big reason, but there are also
> similar range issues with Adobe Reader online PDFs, Google Maps/Earth,
> and Flash videos when paused/resumed. Potentially other stuff, but I
> have not heard of problems.
>
> This will allow anyone to fine-tune the places where ranges are
> permitted or forced to fully cache, avoiding the problems a blanket
> limit adds.
>
>>
>> As to which approach would be better, I don't know enough about that
>> data path to really suggest. When I initially made my changes, I just
>> replaced each reference to Config.range_offset_limit or whatever.
>> Today I went back and read some more of the code, but I'm still
>> figuring it out. How often would the limit change based on the request
>> vs. the reply?
>
> Just the once: on first being checked against the reply. And most likely
> in the case of testing for a reply MIME type. The other useful info I
> can think of is all request data.
>
> You can ignore this if you like; I'm just worrying over a borderline
> case. Someone else can code a fix if they find it a problem or need to
> do MIME checks.
>
> Amos
> --
> Please be using
>   Current Stable Squid 2.7.STABLE7 or 3.0.STABLE20
>   Current Beta Squid 3.1.0.15
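Amos's "take the maximum of the two across the two calls, so it can only increase" alternative could look roughly like this; the class and method names are invented for illustration and are not Squid code:

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Illustrative sketch (not Squid source): the limit is first computed from
// request-only details, and a later recomputation with reply details may
// raise it but never lower it, avoiding the "request made, then cancelled,
// result unstorable" trap discussed in the thread.
class RangeOffsetLimitCache {
public:
    int64_t update(int64_t freshlyComputed) {
        if (!haveValue_) {
            haveValue_ = true;              // the extra flag Amos mentions
            value_ = freshlyComputed;       // request-only computation
        } else {
            value_ = std::max(value_, freshlyComputed);  // only increase
        }
        return value_;
    }

private:
    bool haveValue_ = false;
    int64_t value_ = 0;
};
```

The flag distinguishes "never computed" from any legitimate limit value, so no magic sentinel in the cached value itself is needed.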