Re: /bzr/squid3/trunk/ r12194: Small optimization in CommOpener statistic accounting.

2012-07-03 Thread Henrik Nordström
tis 2012-07-03 klockan 07:52 +0200 skrev Kinkie:

++(conn_-getPeer()-stats.conn_open);
 
 
  IMHO we should consistently use bracketing as above to clarify in situations
  like this where there is any complex location syntax.
 
 It is the latter, but I had the some doubt so I double-checked. I
 agree however that this may be not as readable as I'd like.
 Ok for the bracketing, I'll review the cases that snuck in and correct them.


Usually the other ++ works better in readability

conn_-getPeer()-stats.conn_open++;

or even

conn_-getPeer()-stats.conn_open += 1;

Regards
Henrik



Re: [RFC] Certificate validation helper

2012-07-03 Thread Kinkie
 Do you want me to provide basic reusable classes for helper request
 formatting and response parsing, if I have a chance?

YES, please!

-- 
/kinkie


Re: /bzr/squid3/trunk/ r12194: Small optimization in CommOpener statistic accounting.

2012-07-03 Thread Kinkie
 Usually the other ++ works better in readability

 conn_-getPeer()-stats.conn_open++;

 or even

 conn_-getPeer()-stats.conn_open += 1;

With GCC this code:
15  int a=0;
16  a++;
17  ++a;
18  a+=1;

gets assembled as:
16  a++;
= 0x0804873c main(int, char**)+168:   83 44 24 18 01 addl   $0x1,0x18(%esp)
(gdb)
17  ++a;
= 0x08048741 main(int, char**)+173:   83 44 24 18 01 addl   $0x1,0x18(%esp)
(gdb)
18  a+=1;
= 0x08048746 main(int, char**)+178:   83 44 24 18 01 addl   $0x1,0x18(%esp)



-- 
/kinkie


Re: /bzr/squid3/trunk/ r12194: Small optimization in CommOpener statistic accounting.

2012-07-03 Thread Alex Rousskov
On 07/02/2012 04:34 PM, Amos Jeffries wrote:
 On 03.07.2012 03:30, Francesco Chemolli wrote:

 -if (conn_-getPeer())
 -conn_-getPeer()-stats.conn_open++;
 +if (peer *peer=(conn_-getPeer()))
 +++peer-stats.conn_open;

  lookupLocalAddress();
 
 
 Two points:
 
 1) assignment in the if() needs to be double-bracketed around the whole
 = operator expression, not just the RHS:
 if ((peer *peer=conn_-getPeer()))


No, please do not do that! This is not a regular assignment, it is a
variable declaration with initialization. It does not need more
brackets. In fact, I would not be surprised if some compilers will fail
if those extra brackets are included.

The problem with that expression is that the variable name shadows the
type (it is usually not a bug due to C++ name resolution rules but bad
style).

It also uses too many brackets on the RHS. I know that some like extra
brackets (and then skimp on spaces, resulting in a really hard-to-read
stream of characters!) so I am not going to fight about this aspect.

The best syntax in this particular context is

if (peer *p = conn_-getPeer())

HTH,

Alex.


Re: Couple pointers

2012-07-03 Thread Alex Rousskov
On 07/02/2012 05:42 PM, Amos Jeffries wrote:

  ... is anyone able to come up with a script or tester to detect classes
 which violate our Big-3 constructor/destructor/assignment operator
 guideline? I know there are portions of the code which are broken today
 and it is easily overlooked, this would be a nice one to enforce
 automatically.

Hi Amos,

Yes, Measurement Factory has purchased a Coverity static analysis
license to automatically detect bugs like the ones you describe. I have
run one Squid test during evaluation, and the tool did detect missing
constructor/destructor/assignment operators (among hundreds of other
bugs and non-bugs).

We are waiting for Factory sysadmin to set the tool up and for Squid
Project sysadmin to setup [automated] Squid tests the results of which
can then be somehow shared with Squid developers (we did get Coverity
permission to do that). I hope this will be done in July.


HTH,

Alex.


Re: /bzr/squid3/trunk/ r12194: Small optimization in CommOpener statistic accounting.

2012-07-03 Thread Amos Jeffries

On 03.07.2012 21:27, Kinkie wrote:

Usually the other ++ works better in readability

conn_-getPeer()-stats.conn_open++;

or even

conn_-getPeer()-stats.conn_open += 1;


With GCC this code:
15  int a=0;
16  a++;
17  ++a;
18  a+=1;

gets assembled as:
16  a++;
= 0x0804873c main(int, char**)+168:	 83 44 24 18 01	addl   
$0x1,0x18(%esp)

(gdb)
17  ++a;
= 0x08048741 main(int, char**)+173:	 83 44 24 18 01	addl   
$0x1,0x18(%esp)

(gdb)
18  a+=1;
= 0x08048746 main(int, char**)+178:	 83 44 24 18 01	addl   
$0x1,0x18(%esp)



That matches what the recent blog post benchmarks are saying.

Con:
 * Modern compilers are able usually to optimize all forms 
automatically to one instruction.
 * when they optimize to 2 instructions, those are usually run in 
parallel by the CPU optimizations anyway.
 * +=1 is sometimes (when?) benchmarked as faster than both by about 2x 
the difference between the ++ themselves.
 * C++11 compilers contain a lot of optimizations (like auto-merging 
these) which dwarf these manual benefits.


Pros: (apparently)
 * -O0 builds apparently have some difference, (but then does anyone 
care with -O0?)
 * its easier to code a pre-increment operator on custom classes than a 
post. (do we care?)
 * pre-increment on compound objects does not undergo auto-magical 
conversion to operator+=() by default. Apparently post-increment does.
 * there is apparently still a difference on some less common CPU 
architectures. PowerPC was brought up as an example. No mention of ARM, 
so I'd be interested to know if ARM has any difference.




IMO,

 * This last two pros appear to make it still worthwhile for portable 
software like Squid. For now.


 * this is the type of minor optimization that we should accept as it 
comes, but not waste time arguing over (or obfuscating the code to 
achieve!). It's too small an optimization to be worth the trouble.


 * my point about ++conn_-getPeer()-stats.conn_open is solely down 
to readability. Which is trivially resolved by ++(x), with brackets 
around the symbol being incremented. Using bracketing just for clarity 
covers all sorts of borderline expressions which appear obfuscated at 
first glance.


Amos



Re: [RFC] Certificate validation helper

2012-07-03 Thread Amos Jeffries

On 03.07.2012 14:59, Alex Rousskov wrote:

On 07/02/2012 06:20 PM, Amos Jeffries wrote:

On 03.07.2012 08:11, Alex Rousskov wrote:

Hello,

Awaken by DigiNotar CA compromise, various web agents now try 
harder
to validate SSL certificates (see 2011 squid-dev thread titled SSL 
Bump
Certificate Blacklist for a good intro). From user point of view, 
a

bumping Squid is the ultimate authority on server certificate
validation, so we need to go beyond basic OpenSSL checks as well.

Various protocols and other validation approaches are floating 
around:
CRLs, OCSP, SCVP, DNSSEC DANE, SSL Notaries, etc. There is no 
apparent

winner at the moment so we are in a stage of local experimentation
through trial-and-error. We have seriously considered implementing 
one
of the above mentioned approaches in Squid, but it looks like it 
would
be better to add support for a general validation helper instead, 
so

that admins can experiment with different approaches.

The helper will be optionally consulted after a successful internal
OpenSSL validation we do now. It will receive the origin server
certificate [chain] and the intended domain name. On errors, the 
helper
will return the validation error name (see %err_name error page 
macro
and %err_details logformat code), error reason (%ssl_lib_error 
macro),
and possibly the offending certificate (%ssl_subject and 
%ssl_ca_name
macros) -- similar to what the internal validation code returns 
now.


Squid may maintain a cache of certificate validation results to 
reduce
validation performance burden (indexed by certificate 
fingerprint?).


On this point. If you are using the helper API you will be wanting 
to
have a helper result cache. These are keyed on the input string sent 
to

the helper.



Squid will provide a dummy helper. Eventually, full-featured 
helpers may

be contributed (but I am currently not planning on writing one).


If there are no objections, I would like to detail the above on 
Squid
wiki and eventually submit a patch implementing this feature in 
v3.3. Do

you think that is a good idea?



I think it would be good to provide the extra flexibility.



I am in the process of modifying the helper API for consistency 
across

all helpers starting with 3.3. It would be great if you could design
your helper to use a generic output format for data sent back to 
Squid:


 [channel-ID] (OK/ERR/BH) [key-pairs] terminator


OK, but not all helper communication is line-based. We may need to 
send

PEM-encoded certificates back (and ssl_crtd already does that). That
requires sending multiline blocks of data.

If you want to generalize that, consider adding body start/end 
terminators.


I know. That is why I omit the word line and specify terminator 
instead of EOL.



 * With BH optionally as a broken-helper failure state as per the 
NTLM

helper API.
 * Noticing how there are *no* fixed-position fields beyond the 
result

code.

For data from Squid to helper please design with minimum number of
fixed-position/type fields and define [key-pair] as optional
extension(s) to hold whatever data is optionally sent to the helper.
Ideally we will never transmit any fields which could be 
-/[unknown]

to or from the helper.


Sounds good.

Do you want me to provide basic reusable classes for helper request
formatting and response parsing, if I have a chance?


I have one almost finished as a callback params object for replies. As 
soon as it passes all the build tests I'll be submitting a patch before 
starting to merge the parsers and extend the available response 
key-pairs.


If you want to make a generic one for requests that would help. It's a 
bit more complicated to get backward compatible due to the larger syntax 
variance between helper request formats.


Amos



Re: [RFC] Certificate validation helper

2012-07-03 Thread Alex Rousskov
On 07/03/2012 04:56 PM, Amos Jeffries wrote:
 On 03.07.2012 14:59, Alex Rousskov wrote:
 On 07/02/2012 06:20 PM, Amos Jeffries wrote:

 I am in the process of modifying the helper API for consistency across
 all helpers starting with 3.3. It would be great if you could design
 your helper to use a generic output format for data sent back to Squid:

  [channel-ID] (OK/ERR/BH) [key-pairs] terminator

 OK, but not all helper communication is line-based. We may need to send
 PEM-encoded certificates back (and ssl_crtd already does that). That
 requires sending multiline blocks of data.

 If you want to generalize that, consider adding body start/end
 terminators.
 
 I know. That is why I omit the word line and specify terminator
 instead of EOL.


The proposed format is missing the body itself, unless you want to force
all helpers to use key=value format for blobs such as PEM-encoded
certificates. Ideally, there will be a way for generic helper parsers to
detect and extract the body. To reach that ideal, there should be a
common format that includes the body.


 Do you want me to provide basic reusable classes for helper request
 formatting and response parsing, if I have a chance?
 
 I have one almost finished as a callback params object for replies. As
 soon as it passes all the build tests I'll be submitting a patch before
 starting to merge the parsers and extend the available response key-pairs.
 
 If you want to make a generic one for requests that would help. It's a
 bit more complicated to get backward compatible due to the larger syntax
 variance between helper request formats.

Understood.

Alex.



Re: [RFC] Certificate validation helper

2012-07-03 Thread Alex Rousskov
On 07/03/2012 06:54 PM, Amos Jeffries wrote:
 On 04.07.2012 12:11, Alex Rousskov wrote:
 On 07/03/2012 04:56 PM, Amos Jeffries wrote:
 On 03.07.2012 14:59, Alex Rousskov wrote:
 On 07/02/2012 06:20 PM, Amos Jeffries wrote:

 I am in the process of modifying the helper API for consistency across
 all helpers starting with 3.3. It would be great if you could design
 your helper to use a generic output format for data sent back to
 Squid:

  [channel-ID] (OK/ERR/BH) [key-pairs] terminator

 OK, but not all helper communication is line-based. We may need to send
 PEM-encoded certificates back (and ssl_crtd already does that). That
 requires sending multiline blocks of data.

 If you want to generalize that, consider adding body start/end
 terminators.

 I know. That is why I omit the word line and specify terminator
 instead of EOL.


 The proposed format is missing the body itself, unless you want to force
 all helpers to use key=value format for blobs such as PEM-encoded
 certificates.
 
 Oops. yes. The HelperReply object has to include a field blob of type
 char* specific to each helper (for certs and bodies blobs, messages,
 etc.) which includes everything between the first undentifiable key-pair
 and the terminator. It is required for backward compatibility even if I
 was set on key-pair always. So it may as well be formalised.

Why should the body include key-value pairs?

 [channel-ID] (OK/ERR/BH) [key-pairs] blob terminator

How will the generic code be able to tell where key-pairs end and blob
begins?


 Ideally, there will be a way for generic helper parsers to
 detect and extract the body. To reach that ideal, there should be a
 common format that includes the body.

 
 Yes.
 
 Pedant: There are only 2 helpers out of 14 that send certificate bodies.
 This new one and ssl_crtd. Why define a body field just for them?

It is certainly not needed if you do not want to have one parser/format
for all helpers. We can continue to craft custom code for each new
helper that needs to send bodies if you think that is the best approach.


 With [key-pair] before any helper-specific blob, we can add a key-pair
 cert=foo for generic certificate passing around if/when necessary.

This approach does not work well because foo may include spaces/'='
and, hence, be confused with more key-pairs.


If you want one format for all, you probably need something like


 [channel-ID] (OK/ERR/BH) [key-pairs] [BS body] terminator

where BS is some special marker that starts all bodies and cannot be
found in key-pairs. Since any body-carrying message is likely to have
new lines (and, hence, would need a non-newline terminator), this BS
sequence could be something like body:\n, I guess.


Thank you,

Alex.



Re: [RFC] Certificate validation helper

2012-07-03 Thread Amos Jeffries

On 04.07.2012 13:07, Alex Rousskov wrote:

On 07/03/2012 06:54 PM, Amos Jeffries wrote:

On 04.07.2012 12:11, Alex Rousskov wrote:

On 07/03/2012 04:56 PM, Amos Jeffries wrote:

On 03.07.2012 14:59, Alex Rousskov wrote:

On 07/02/2012 06:20 PM, Amos Jeffries wrote:


I am in the process of modifying the helper API for consistency 
across
all helpers starting with 3.3. It would be great if you could 
design

your helper to use a generic output format for data sent back to
Squid:

 [channel-ID] (OK/ERR/BH) [key-pairs] terminator


OK, but not all helper communication is line-based. We may need 
to send
PEM-encoded certificates back (and ssl_crtd already does that). 
That

requires sending multiline blocks of data.

If you want to generalize that, consider adding body start/end
terminators.


I know. That is why I omit the word line and specify 
terminator

instead of EOL.



The proposed format is missing the body itself, unless you want to 
force

all helpers to use key=value format for blobs such as PEM-encoded
certificates.


Oops. yes. The HelperReply object has to include a field blob of 
type

char* specific to each helper (for certs and bodies blobs, messages,
etc.) which includes everything between the first undentifiable 
key-pair
and the terminator. It is required for backward compatibility even 
if I

was set on key-pair always. So it may as well be formalised.


Why should the body include key-value pairs?


Huh? which body are you referring to now?





[channel-ID] (OK/ERR/BH) [key-pairs] blob terminator


How will the generic code be able to tell where key-pairs end and 
blob

begins?


The generic code will have a format for key-pair of ( 1#token = 
1#(token / quoted-string) )  [to be finalized still] with a set of known 
key names. If the bytes being parsed do not match that format and keys 
its not part of the key-pairs.


In that sense it is not completely generic, but more cross-helper 
shared key-pairs being accepted back without barfing. Some will of 
course be accepted but unused by certain helpers handler code (ie a 
url=FOO from authenticators).



NOTE:
 Existing code has a shared parser pulling off [channel-ID] and passing 
the remainder as char* to the helpers callback.


 What I'm doing in patch-#1 is making that shared parser pathway also 
handle the OK/ERR/... result field shared by almost all the helpers 
(redirectors exception) and converting the char* callback parameter to 
an object (result, blob). Some issues being resolved around char* vs 
String vs MemBuf (vs Sbuf?) for blob. Sort of settling on MemBuf for the 
ssl_crtd \0 handling requirements.


 The plan for patch-#2 is to discuss key-pair syntax (which we seem to 
have started), and move some particular key-pair parsing from helper 
handlers into the shared one. Opening up user=, tag=, password=, log=, 
message= keys to non-ACL helper responses will be VERY useful for RADIUS 
and systems needing to maintain cross-helper request state.






Ideally, there will be a way for generic helper parsers to
detect and extract the body. To reach that ideal, there should be a
common format that includes the body.



Yes.

Pedant: There are only 2 helpers out of 14 that send certificate 
bodies.

This new one and ssl_crtd. Why define a body field just for them?


It is certainly not needed if you do not want to have one 
parser/format

for all helpers. We can continue to craft custom code for each new
helper that needs to send bodies if you think that is the best 
approach.





Um, we seem to be getting off on different paths.
I *do* want a shared parser for as many details as possible ideally all 
details as key-pair. I'm asking why you want a special field locking one 
position in just for certificate body?


Defining a foo= key would possibly be useful. Even a custom value 
micro-format for that key-pair. The key-pairs are ALL optional when 
overall cross-helper needs are considered.





With [key-pair] before any helper-specific blob, we can add a 
key-pair
cert=foo for generic certificate passing around if/when 
necessary.


This approach does not work well because foo may include spaces/'='
and, hence, be confused with more key-pairs.


We are going to have to create a generic parser for HTTP header 
field-values with this same syntax of key-pairs. With quoted-string and 
url-encoding already requirements on some helpers re-using the same 
parser routine to output key-pair and field-value is not going to be an 
issue.





If you want one format for all, you probably need something like


 [channel-ID] (OK/ERR/BH) [key-pairs] [BS body] terminator

where BS is some special marker that starts all bodies and cannot 
be

found in key-pairs. Since any body-carrying message is likely to have
new lines (and, hence, would need a non-newline terminator), this BS
sequence could be something like body:\n, I guess.


We REQUIRE a format that does not rely on new special markers like that 
in order to cope with the existing helpers 

Generic helper I/O format

2012-07-03 Thread Alex Rousskov
On 07/03/2012 08:10 PM, Amos Jeffries wrote:

 [channel-ID] (OK/ERR/BH) [key-pairs] blob terminator

 How will the generic code be able to tell where key-pairs end and blob
 begins?
 
 The generic code will have a format for key-pair of ( 1#token =
 1#(token / quoted-string) )  [to be finalized still] with a set of known
 key names. If the bytes being parsed do not match that format and keys
 its not part of the key-pairs.

There are two problems with this approach, IMO:

  * It cannot handle a body that just happened to start with supported
key-value pair characters.

  * It cannot detect a case of helper sending unsupported key-value
pairs (if that helper may send a body too).

It also requires extra code/work to pre-register all known key names,
but that is minor.


 Some issues being resolved around char* vs
 String vs MemBuf (vs Sbuf?) for blob. Sort of settling on MemBuf for the
 ssl_crtd \0 handling requirements.

I agree that we should use MemBuf for this.


 The plan for patch-#2 is to discuss key-pair syntax (which we seem to
 have started), and move some particular key-pair parsing from helper
 handlers into the shared one. Opening up user=, tag=, password=, log=,
 message= keys to non-ACL helper responses will be VERY useful for RADIUS
 and systems needing to maintain cross-helper request state.

I think we should keep it as simple as possible without causing problems
with existing parsers and leaving some room for future extensions:

* key/name: alphanumeric char followed by anything except (whitespace or
'=' or terminator)

* value: either quoted-string or anything except (whitespace or
terminator); could be empty

The parser strips whitespace and strips quotes around quoted strings.
Quoted strings use backslash to escape any octet.


 If you want one format for all, you probably need something like

  [channel-ID] (OK/ERR/BH) [key-pairs] [BS body] terminator

 where BS is some special marker that starts all bodies and cannot be
 found in key-pairs. Since any body-carrying message is likely to have
 new lines (and, hence, would need a non-newline terminator), this BS
 sequence could be something like body:\n, I guess.
 
 We REQUIRE a format that does not rely on new special markers like that
 in order to cope with the existing helpers responses in a
 backward-compatible way without additional user configuration.

When your parser sees n=v  after ERR and n is a known key name, how
would it distinguish these two possibilities:

  * This is a start of a blob
  * This is a key=value pair


 Or would you advocate moving the external_acl_type protocol=N.N config
 option into the helper child object for use by all helpers on
 determining what key-pairs and format is accepted?

I am probably missing some important use cases, but my generic parser
would just call a virtual method for each key=value pair found (with no
pre-registration of supported keys) and another virtual method for the
body blob, if any. It will parse everything using the key, value, and
BS formats discussed above.


HTH,

Alex.


Re: Couple pointers

2012-07-03 Thread Amos Jeffries

On 4/07/2012 6:04 a.m., Alex Rousskov wrote:

On 07/02/2012 05:42 PM, Amos Jeffries wrote:


  ... is anyone able to come up with a script or tester to detect classes
which violate our Big-3 constructor/destructor/assignment operator
guideline? I know there are portions of the code which are broken today
and it is easily overlooked, this would be a nice one to enforce
automatically.

Hi Amos,

 Yes, Measurement Factory has purchased a Coverity static analysis
license to automatically detect bugs like the ones you describe. I have
run one Squid test during evaluation, and the tool did detect missing
constructor/destructor/assignment operators (among hundreds of other
bugs and non-bugs).

We are waiting for Factory sysadmin to set the tool up and for Squid
Project sysadmin to setup [automated] Squid tests the results of which
can then be somehow shared with Squid developers (we did get Coverity
permission to do that). I hope this will be done in July.


HTH,

Alex.


Sweet. I wondered what happend with Coverity. We had an FOSS account 
there years ago but after the ring level changes they stopped contacting 
us and even scanning recent sources, kept scanning daily but only on old 
sources.


Amos