Re: concurrency attribute and questions.
louis gonzales wrote:

List,

1) For the concurrency attribute, does this simply indicate how many items in a batch will be sent to the external helper?

Sort of. The strict definition of 'batch' does not apply. Better to think of it as a window max-size. So from 0 to N-concurrency items will be passed straight to the helper before Squid starts waiting for their replies to free up the slots.

1.1) Assuming concurrency is set to 6, for example, and let's assume a user's browser session sends out 7 actual URLs through the proxy request - does this mean 6 will go to the first instance of the external helper, and the 7th will go to a second instance of the helper?

1-6 will go straight through, probably with the IDs 1-6. #7 may or may not go straight through, depending on whether one of the first 6 was finished at that time.

1.1.1) Assuming the 6 from the first part of the batch return OK and the 7th returns ERR, will the user's browser session render the 6 and not render the 7th?

Depends entirely on how the ERR/OK results are used in squid.conf. (You might be denying on OK or allowing on ERR.)

More importantly, how does Squid know that the two batches - one of 6, and one with 1, for the 7 total - came from the same browser session?

There is no such thing as a browser session to Squid. Each request is a separate object. These 7 MAY happen to be coming from the same IP, but may be different software for all Squid cares, or may come from more than one IP completely.

What I have currently:
- openldap with postgresql, used for my user database, which permits me to use the auth_param squid_ldap_auth module to authenticate my users
- a postgresql database storing my ACLs for the given user database

Process:
Step 1: user authenticates through squid_ldap_auth
Step 2: the user-requested URL (and obviously all images, content, ...)
get passed to the external helper
Step 3: external helper checks those URLs against the database for the specific user and then determines OK or ERR

Issue 1: How to have the user-requested URL (and all images, content, ...) get passed as a batch/bundle to a single external helper instance, so I can collectively determine OK or ERR. Any ideas? Is the concurrency attribute to declare a maximum number of requests that go to a single external helper instance?

It is the number of *parallel* requests the helper can process. Most helpers shipped with Squid are non-parallel (concurrency=1).

So if I set concurrency to 15, should I have the external helper read count++ while STDIN lines come in, until no more, so then I know I have X number in a batch/bundle?

Depends on the language your helper is coded in. As long as it can process 15 lines of input in parallel without mixing anything up. Looks like a Perl helper; they can do parallel just fine with no special reads needed. But it must handle the extra ID token at the start of the line properly.

Obviously there is no way to predetermine how many URLs/URIs will need to be checked against the database, so if I set concurrency to 1024, presuming that to be high enough that no single request will max it out, then can I just count++ and, when the external helper is done counting STDIN readlines, process to determine OK or ERR for that specific request?

An additional point to this: the ttl=N option will cache the OK/ERR result for that lookup for N seconds. This can greatly reduce the number of tests passed back even further.

Issue 2: I'd like to just have a single external helper instance start up that can fork() and deal with each URL/URI request; however, I'm not sure Squid in its current incarnation passes enough information, OR permits specific enough passback (from the helper) information, to make this happen.

Squid passes an ID for each line of input.
As long as the result goes back out the stdout of the helper that Squid itself forked, with that ID at the front, Squid does not care about the order of responses. You will need to make sure your parallel children's stdout/stderr write to your parent helper's stdout/stderr. But it should be possible.

Amos
--
Please be using
Current Stable Squid 2.7.STABLE6 or 3.0.STABLE13
Current Beta Squid 3.1.0.6
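The "parallel children writing to the parent's stdout" point works because a fork()ed child inherits the parent's open file descriptors. A minimal POSIX sketch, in Python for brevity (a pipe stands in for the helper's stdout so the effect is observable; in a real helper the child would write to descriptor 1):

```python
import os

def answer_via_fork(channel_id: str, verdict: str, out_fd: int) -> None:
    """Fork a child that writes the ID-tagged reply on an inherited fd.

    POSIX-only sketch: in a real helper out_fd would be stdout (fd 1).
    """
    pid = os.fork()
    if pid == 0:
        # Child: inherits out_fd from the parent, writes its reply, exits.
        os.write(out_fd, f"{channel_id} {verdict}\n".encode())
        os._exit(0)
    os.waitpid(pid, 0)  # parent reaps the child

# Demonstration: the child's write arrives on the parent's descriptor.
read_fd, write_fd = os.pipe()
answer_via_fork("7", "ERR", write_fd)
os.close(write_fd)
reply = os.read(read_fd, 1024).decode()
os.close(read_fd)
```

Because the reply carries the channel ID, it does not matter which child finishes first; Squid matches replies to pending requests by ID, not by order.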
Re: concurrency attribute and questions.
Amos,

Thanks for your responses, they make things clearer for me, so now I can ask better questions :)

What I'd like to do is have my Perl helper fork as necessary, rather than starting up (children=50) or (children=100), or N external_acl_type instances, which is not efficient; and based off an indeterminate number of users, 50 or 100 may not be enough, or at times too many. What settings in the squid.conf line tell Squid the external helper will fork to handle subsequent objects? Below is my current line:

external_acl_type eXhelperI children=1 %LOGIN %METHOD %{Host} /usr/lib/squid/eXhelper.pl

Since I set children=1, only one eXhelper.pl starts up with Squid, with the idea in mind that eXhelper forks child processes as necessary.

I'm still trying to determine what state information Squid passes to the external helper besides the %LOGIN/%METHOD... [below you mentioned an ID token; are you referring to the %LOGIN ID token? Or something else?]. I understand that Squid forks the eXhelper.pl, which means Squid owns the ppid (parent process ID) of the eXhelper.pl. Ideally I'd like to have this single child then fork subprocesses too. Currently I'm uncertain what input trigger (or signal), if any, exists to have the single external helper fork the subprocess to check the object, and how to uniquely ensure the OK or ERR goes back to the calling ID token.

Thank you again - I did write some comments below.

On Sun, Apr 5, 2009 at 6:05 AM, Amos Jeffries squ...@treenet.co.nz wrote:

louis gonzales wrote: List, 1) for the concurrency attribute does this simply indicate how many items in a batch will be sent to the external helper?

Sort of. The strict definition of 'batch' does not apply. Better to think of it as a window max-size.

Louis: So should I have the Perl helper buffer the data passed to it, rather than reading line by line - and if buffering, what are the start and end identifiers?
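On the configuration question above: the external_acl_type option that enables in-helper parallelism is concurrency=, not children=. A hedged sketch of how the line above might be adjusted (the ttl= value and ACL name are illustrative, not from the thread; option values would need tuning):

```
# Sketch only: one helper process, up to 1024 lookups in flight,
# verdicts cached for 60 seconds. Helper path as in the thread.
external_acl_type eXhelperI ttl=60 concurrency=1024 children=1 %LOGIN %METHOD %{Host} /usr/lib/squid/eXhelper.pl
acl url_allowed external eXhelperI
http_access allow url_allowed
```

With concurrency set, Squid prefixes each request line with a channel ID and the one helper process is expected to answer lines in parallel, which is what makes the fork-as-needed design feasible.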
So from 0 to N-concurrency items will be passed straight to the helper before Squid starts waiting for their replies to free up the slots.

Louis: If the 0th-(jth) objects belong to a specific user's request and the (jth)-Nth belong to a different user's request, assuming concurrency is set to N, how does one differentiate in the external helper which set belongs to whom? (I'm using the %LOGIN parameter so I know which userID, as authenticated by LDAP, is making the request.) In other words, after I've determined OK for the 0th-(jth) and ERR for the (jth)-Nth, the specific instance of the helper will need to return two different values. Basically my helper checks each one of the Squid-passed object (URL/%LOGIN) pairs against the ACLs in the postgresql database.

My use case guarantees the only end-user application will be a web browser, so with that assumption, when the end user opens www.foxnews.com, for instance, there are a multitude of objects. So my specific question is: when Squid goes to retrieve all of these objects for the requesting user, does Squid - a) with concurrency set high enough, send all of these objects to the same external helper instance and await a single OK or ERR? and b) with concurrency off, do objects map one-to-one onto external helper instances, each awaiting its own OK or ERR?

1.1) Assuming concurrency is set to 6, for example, and let's assume a user's browser session sends out 7 actual URLs through the proxy request - does this mean 6 will go to the first instance of the external helper, and the 7th will go to a second instance of the helper?

1-6 will go straight through, probably with the IDs 1-6. #7 may or may not go straight through, depending on whether one of the first 6 was finished at that time.

Louis: is it ever possible, with concurrency enabled, that objects from two different users will enter into a single external helper instance?
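On differentiating users within one helper: each input line is self-contained, carrying its own channel ID plus the %LOGIN, %METHOD, and %{Host} fields from the external_acl_type line, so lines from different users can be interleaved freely and answered independently. A sketch of splitting one such line (the field values here are hypothetical, and the channel ID is distinct from %LOGIN; Python used for illustration):

```python
# One input line for: external_acl_type ... %LOGIN %METHOD %{Host} ...
# with concurrency enabled. The leading "0" is the channel ID that
# Squid prepends; it is NOT the %LOGIN value.
line = "0 louis GET www.foxnews.com"

channel_id, login, method, host = line.split(" ", 3)

# The verdict applies only to this line; echoing the channel ID ties
# it back to the right pending request, whichever user it came from.
reply = f"{channel_id} OK\n"
```

So there is never a single OK/ERR for a whole page load: every object lookup gets its own line and its own per-line answer.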
1.1.1) Assuming the 6 from the first part of the batch return OK and the 7th returns ERR, will the user's browser session render the 6 and not render the 7th?

Depends entirely on how the ERR/OK results are used in squid.conf. (You might be denying on OK or allowing on ERR.)

More importantly, how does Squid know that the two batches - one of 6, and one with 1, for the 7 total - came from the same browser session?

There is no such thing as a browser session to Squid. Each request is a separate object. These 7 MAY happen to be coming from the same IP, but may be different software for all Squid cares, or may come from more than one IP completely.

Louis: right, but Squid obviously has to know which IP the request came from in order to serve the page(s), so when the external helper processes the OK or ERR, certainly those will trace back the path from which they came to the correct requesting application (browser or other).

What I have currently:
- openldap with postgresql, used for my user database, which permits me to use the auth_param squid_ldap_auth module to authenticate my users
- a postgresql database storing my acl's for
concurrency attribute and questions.
List,

1) For the concurrency attribute, does this simply indicate how many items in a batch will be sent to the external helper?

1.1) Assuming concurrency is set to 6, for example, and let's assume a user's browser session sends out 7 actual URLs through the proxy request - does this mean 6 will go to the first instance of the external helper, and the 7th will go to a second instance of the helper?

1.1.1) Assuming the 6 from the first part of the batch return OK and the 7th returns ERR, will the user's browser session render the 6 and not render the 7th? More importantly, how does Squid know that the two batches - one of 6, and one with 1, for the 7 total - came from the same browser session?

What I have currently:
- openldap with postgresql, used for my user database, which permits me to use the auth_param squid_ldap_auth module to authenticate my users
- a postgresql database storing my ACLs for the given user database

Process:
Step 1: user authenticates through squid_ldap_auth
Step 2: the user-requested URLs (and obviously all images, content, ...) get passed to the external helper
Step 3: external helper checks those URLs against the database for the specific user and then determines OK or ERR

Issue 1: How to have the user-requested URL (and all images, content, ...) get passed as a batch/bundle to a single external helper instance, so I can collectively determine OK or ERR. Any ideas? Is the concurrency attribute to declare a maximum number of requests that go to a single external helper instance? So if I set concurrency to 15, should I have the external helper read count++ while STDIN lines come in, until no more, so then I know I have X number in a batch/bundle?
Obviously there is no way to predetermine how many URLs/URIs will need to be checked against the database, so if I set concurrency to 1024, presuming that to be high enough that no single request will max it out, then can I just count++ and, when the external helper is done counting STDIN readlines, process to determine OK or ERR for that specific request?

Issue 2: I'd like to just have a single external helper instance start up that can fork() and deal with each URL/URI request; however, I'm not sure Squid in its current incarnation passes enough information, OR permits specific enough passback (from the helper) information, to make this happen.

Any deeper insights would be tremendously appreciated.

Thanks,
--
Louis Gonzales
BSCS EMU 2003
HP Certified Professional
louis.gonza...@linuxlouis.net