On Tue, 30 Aug 2011 21:46:05 -0600, Alex Rousskov wrote:
On 08/30/2011 09:00 PM, Amos Jeffries wrote:
On Tue, 30 Aug 2011 15:55:56 -0600, Alex Rousskov wrote:
On 08/28/2011 01:10 PM, Amos Jeffries wrote:
On 29/08/11 06:39, Tsantilas Christos wrote:
On 08/27/2011 08:03 PM, Amos Jeffries wrote:
On 28/08/11 02:50, Tsantilas Christos wrote:
%>la for intercepted connections

This patch adjusts the %>la logformat code handling for intercepted
connections based on the following rules:
- If the corresponding http_port or https_port option has an explicit
listening host name or IP address, then log the IP address.
- Otherwise, log a dash character.

Also adjusts %>lp logformat code handling for intercepted connections to
always log the port number from the corresponding http_port or https_port
option.
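
The rules above can be sketched as a small model. This is an illustrative Python sketch only, not Squid code: `PortConfig`, `format_la` and `format_lp` are hypothetical names, a configured host name is assumed to be already resolved to an IP address, and the non-intercepted branch (log the accepted connection's own local address and port) is an assumption based on the behaviour described later in this thread.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PortConfig:
    """Hypothetical model of one http_port / https_port line."""
    port: int
    listen_ip: Optional[str] = None  # None when no explicit host/IP is configured
    intercepted: bool = False


def format_la(cfg: PortConfig, conn_local_ip: str) -> str:
    """Model of %>la under the patch rules."""
    if cfg.intercepted:
        # Explicit listening address: log it. Otherwise log a dash.
        return cfg.listen_ip if cfg.listen_ip is not None else "-"
    # Assumption: non-intercepted traffic logs the accepted socket's local IP.
    return conn_local_ip


def format_lp(cfg: PortConfig, conn_local_port: int) -> str:
    """Model of %>lp under the patch rules."""
    if cfg.intercepted:
        # Always the port number from the http_port / https_port option.
        return str(cfg.port)
    # Assumption: non-intercepted traffic logs the accepted socket's local port.
    return str(conn_local_port)
```

With an intercepting port that has no explicit address, this model logs "-" for the address and the configured port number for the port, regardless of the destination the client originally asked for.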

+1. Looks fine.

Amos

I will commit this patch to trunk if there is not any objection.


PS. I forgot to mention that this is a Measurement Factory project.


This whole thing itches a worry in the back of my mind. Updating the release notes about %>la creation today makes me realize what it is.

 We are using ">" on tags to indicate incoming things,

I do not think that part is accurate. I will try to provide a better
definition below.

usually state
shared with the client's view of the world. This change makes the tag lose that overlap with the client's world view on intercepted traffic.

What do you think about resurrecting %la / %lp for this data instead?

I think ">" is the right choice here because we are logging the Squid
address where the client has connected to:

">" means information related to the client-Squid connection
"<" means information related to the Squid-server connection


Yes. And the lack of it appears to consistently represent Squid's view
of something, regardless of whether it was client or server.

... Such as the config port a transaction came through. ie "%la"

"l" means information related to the Squid side of a connection

and _that_ is what this patch breaks. Or rather obfuscates for
intercepted traffic.


Thus,

">l" means information related to the Squid side of a client-Squid
connection, and that is what we want to log.


Which worries me. I agreed to it earlier on the grounds that it was Squid's outward view of the connection. But taking a closer look at the concepts
and documentation vs the patch, the misgivings come back.

The patch changes the meaning of that definition from "local address" to
"listening address".

Yes, for intercepted connections. Listening address is a local address.


"local address" ("the Squid side of a client-Squid connection") at the connection/TCP/IP level is what al->tcpClient contains right now, before
patching. The actual real client->Squid connection's IP:port.

If we are to go into these low-level details, one could argue that there is no actual/real client-Squid connection at all because the client does
not think it is talking to Squid.


Meaning our definition for the "l" is a bit wrong here.

Consider that there are two FDs involved with each connection and how we
handle those.
 FD 1 is listening, it has la of ::, and lp of 3129. no remote.
 FD 2 is a connection received on that. It has local=10.0.0.1:80
 remote=192.168.0.52:123

 FD 3 is listening, it has la of 192.168.0.1, and lp of 3128. no remote.
 FD 4 is a connection received on that. It has local=192.168.0.1:3128
 remote=192.168.0.52:456

now the details as you describe:

">" means information related to the client-Squid connection

... AIUI that would be FD 2 and FD 4.

"l" means information related to the Squid side of a connection

... AIUI that would be from FD 4 : 192.168.0.1 (>la) and 3128 (>lp)

BUT you want FD 3 local and FD 4 remote to log here. Why not also log FD 1 local and FD 2 remote on their line? They are the same "the Squid side
of a client-Squid connection" by that definition.

I do not fully understand your specific examples. I see no relevant
differences between FD1-2 and FD3-4 groups, and I do not understand how a single connection can have four Squid descriptors associated with it.

That was two connections. I find the fact that you could not tell them apart indicative of the problem we are discussing.

Pair 1+2 was "http_port 3129 intercept" and connection arriving there.
Pair 3+4 was "http_port 3128" and connection arriving there.
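
The two pairs can be written out as data to make the two candidate readings of "the Squid side" concrete. This is a rough Python sketch: the addresses and ports are the ones from the FD example, while `Listener`, `Accepted`, `connection_view` and `listener_view` are hypothetical names used only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Listener:
    fd: int
    la: str          # listening address
    lp: int          # listening port
    intercept: bool


@dataclass(frozen=True)
class Accepted:
    fd: int
    local: str       # "ip:port" as seen on the accepted socket
    remote: str
    listener: Listener


# Pair 1+2: http_port 3129 intercept
fd1 = Listener(fd=1, la="::", lp=3129, intercept=True)
fd2 = Accepted(fd=2, local="10.0.0.1:80", remote="192.168.0.52:123", listener=fd1)

# Pair 3+4: http_port 3128
fd3 = Listener(fd=3, la="192.168.0.1", lp=3128, intercept=False)
fd4 = Accepted(fd=4, local="192.168.0.1:3128", remote="192.168.0.52:456", listener=fd3)


def connection_view(conn: Accepted):
    """Squid side of the accepted connection (TCP-level local address)."""
    ip, _, port = conn.local.rpartition(":")
    return ip, int(port)


def listener_view(conn: Accepted):
    """Squid side as configured on the listening socket."""
    return conn.listener.la, conn.listener.lp
```

For pair 3+4 both views give 192.168.0.1 and 3128, so the choice does not matter. For pair 1+2 the connection view gives 10.0.0.1 and 80 while the listener view gives :: and 3129, which is the ambiguity under discussion.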


My goal here is consistency and clarity of individual tokens. These are about to be used to dynamically generate redirected URLs in deny_info
and error page texts.


I suggested %la / %lp since they seem more fuzzy on where the details come from without > or < claims. Seems a perfect fit for local squid view of something equally fuzzy. Along the lines of how we use %un for
"any username we can find" as opposed to the specific sources.
 AND they have the extra benefit of previously being used to log the
config IP:port by older Squid (under the conditions you want to make >la do so). Reviving them with this more consistent definitive content would technically just be a policy change on their removal. Keeping the policy decision to _move_ origdst over to >la, leaving cases like Linux DNAT
where both have valid non-identical details.

The alternative that occurs to me is our recent use of %S_ where "S"
means Squid. Also a perfect fit by the definitions. But not as easily
backward compatible.


I believe that since the connection is intercepted, it is in the gray area
and many conflicting things will be "kind of true" about it.

If you insist on %la, and Christos is fine with that, let's add %la that
does what Christos implemented for %>la and also log a dash for %>la
when the connection is intercepted.

While the above adds more work, what is critical for me, based on user requests, is that a single logformat option records the actual Squid address
for non-intercepted connections and the specified Squid http_port address
for intercepted connections.
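
The split proposed here could look roughly like the following. This is a sketch of the proposal as worded in this message, not of any committed Squid behaviour; `la_token` and `client_la_token` are hypothetical names standing in for %la and %>la.

```python
from typing import Optional


def la_token(intercepted: bool, conn_local_ip: str,
             configured_ip: Optional[str]) -> str:
    """Proposed %la: the single field that identifies the Squid address."""
    if not intercepted:
        # Actual Squid address of the accepted connection.
        return conn_local_ip
    # Intercepted: the address given on http_port, or a dash if none was given.
    return configured_ip if configured_ip is not None else "-"


def client_la_token(intercepted: bool, conn_local_ip: str) -> str:
    """Proposed %>la: dash for intercepted traffic, actual address otherwise."""
    return "-" if intercepted else conn_local_ip
```

Under this split, a log line for intercepted traffic would carry a dash in %>la, and %la would be the one column to consult for billing by port or address.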

My understanding is that such functionality is needed in environments
where Squid handles regular and intercepted requests on multiple
http_ports and where billing and similar needs require the knowledge of
the port handling each transaction.

Seems a weird design to bill on the Squid listening port rather than client IP. Smells like a system that insists on "http_access allow all" at the top of the config as well.

Amos
