On Tue, 30 Aug 2011 21:46:05 -0600, Alex Rousskov wrote:
On 08/30/2011 09:00 PM, Amos Jeffries wrote:
On Tue, 30 Aug 2011 15:55:56 -0600, Alex Rousskov wrote:
On 08/28/2011 01:10 PM, Amos Jeffries wrote:
On 29/08/11 06:39, Tsantilas Christos wrote:
On 08/27/2011 08:03 PM, Amos Jeffries wrote:
On 28/08/11 02:50, Tsantilas Christos wrote:
%>la for intercepted connections
This patch adjusts the %>la logformat code handling for intercepted
connections based on the following rules:
- If the corresponding http_port or https_port option has an explicit
  listening host name or IP address, then log the IP address.
- Otherwise, log a dash character.
Also adjusts %>lp logformat code handling for intercepted connections
to always log the port number from the corresponding http_port or
https_port option.
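For illustration only, the two rules in the patch description can be
sketched as below. This is not the Squid code: ListeningPort,
format_client_la and format_client_lp are hypothetical names, the
configured address is assumed to be already resolved to an IP, and the
non-intercepted branch assumes the accepted socket's own local
address and port are logged.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ListeningPort:
    """Hypothetical stand-in for one http_port / https_port line."""
    port: int
    address: Optional[str] = None  # None: no explicit host/IP configured
    intercepted: bool = False


def format_client_la(cfg: ListeningPort, socket_local_ip: str) -> str:
    """Sketch of %>la as described in the patch summary."""
    if cfg.intercepted:
        # Explicit listening address: log it. Otherwise log a dash.
        return cfg.address if cfg.address is not None else "-"
    # Assumption: non-intercepted traffic logs the socket's local IP.
    return socket_local_ip


def format_client_lp(cfg: ListeningPort, socket_local_port: int) -> str:
    """Sketch of %>lp: intercepted traffic always logs the configured port."""
    if cfg.intercepted:
        return str(cfg.port)
    return str(socket_local_port)
```

With an "http_port 3129 intercept" style port and no explicit address,
this sketch yields a dash for the address and 3129 for the port,
whatever destination the client originally dialled.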
+1. Looks fine.
Amos
I will commit this patch to trunk if there are no objections.
PS. I forgot to mention that this is a Measurement Factory project.
This whole thing stirs a worry in the back of my mind. Updating the
release notes about the creation of %>la today made me realize what
it is.
We are using ">" on tags to indicate incoming things,
I do not think that part is accurate. I will try to provide a better
definition below.
usually state shared with the client's view of the world. This change
makes the tag lose that overlap with the client's world view on
intercepted traffic.
What do you think about resurrecting %la / %lp for this data instead?
I think ">" is the right choice here because we are logging the Squid
address that the client has connected to:
">" means information related to the client-Squid connection
"<" means information related to the Squid-server connection
Yes. And the lack of it appears to consistently represent Squid's view
of something, regardless of whether it was client or server.
... Such as the config port a transaction came through, i.e. "%la"
"l" means information related to the Squid side of a connection
and _that_ is what this patch breaks. Or rather obfuscates for
intercepted traffic.
Thus,
">l" means information related to the Squid side of a client-Squid
connection, and that is what we want to log.
Which worries me. I agreed to it earlier on the grounds that it was
Squid's outward view of the connection. But taking a closer look at
the concepts and documentation vs the patch, the misgivings come back.
The patch changes meaning of that definition from "local address" to
"listening address".
Yes, for intercepted connections. Listening address is a local address.
"local address" ("the Squid side of a client-Squid connection") at the
connection/TCP/IP level is what al->tcpClient contains right now,
before patching. The actual real client->Squid connection's IP:port.
If we are to go into these low-level details, one could argue that
there is no actual/real client-Squid connection at all because the
client does not think it is talking to Squid.
Meaning our definition for the "l" is a bit wrong here.
Consider that there are two FDs involved with each connection, and how
we handle those.
FD 1 is listening; it has la of :: and lp of 3129. No remote.
FD 2 is a connection received on that. It has local=10.0.0.1:80
remote=192.168.0.52:123
FD 3 is listening; it has la of 192.168.0.1 and lp of 3128. No remote.
FD 4 is a connection received on that. It has local=192.168.0.1:3128
remote=192.168.0.52:456
Now the details as you describe:
">" means information related to the client-Squid connection
... AIUI that would be FD 2 and FD 4.
"l" means information related to the Squid side of a connection
... AIUI that would be from FD 4 : 192.168.0.1 (>la) and 3128 (>lp)
BUT you want FD 3 local and FD 4 remote to log here. Why not also log
FD 1 local and FD 2 remote on their line? They are the same "the Squid
side of a client-Squid connection" by that definition.
I do not fully understand your specific examples. I see no relevant
differences between FD1-2 and FD3-4 groups, and I do not understand how
a single connection can have four Squid descriptors associated with it.
That was two connections. I find the fact that you could not tell them
apart indicative of the problem we are discussing.
Pair 1+2 was "http_port 3129 intercept" and connection arriving there.
Pair 3+4 was "http_port 3128" and connection arriving there.
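As an illustration, the four descriptors above can be written down as
plain data to show where the two readings of "local address" agree and
where they diverge. The addresses and ports are the ones from the
example; the Endpoint and Descriptor types and the socket_view and
listener_view helpers are hypothetical names used only for this sketch.

```python
from typing import NamedTuple, Optional


class Endpoint(NamedTuple):
    ip: str
    port: int


class Descriptor(NamedTuple):
    fd: int
    local: Endpoint
    remote: Optional[Endpoint]  # None for a listening socket


# Values copied from the example in this thread.
FD1 = Descriptor(1, Endpoint("::", 3129), None)  # http_port 3129 intercept
FD2 = Descriptor(2, Endpoint("10.0.0.1", 80), Endpoint("192.168.0.52", 123))
FD3 = Descriptor(3, Endpoint("192.168.0.1", 3128), None)  # http_port 3128
FD4 = Descriptor(4, Endpoint("192.168.0.1", 3128), Endpoint("192.168.0.52", 456))


def socket_view(accepted: Descriptor) -> Endpoint:
    """'Local address' read from the accepted connection itself."""
    return accepted.local


def listener_view(listener: Descriptor) -> Endpoint:
    """'Local address' read from the listening socket / config line."""
    return listener.local


for listener, accepted in ((FD1, FD2), (FD3, FD4)):
    same = socket_view(accepted) == listener_view(listener)
    print(f"FD {listener.fd}+{accepted.fd}: views agree = {same}")
```

For the plain http_port pair the two views coincide, so the choice of
definition is invisible there. For the intercepted pair they differ,
which is the case the disagreement is about.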
My goal here is consistency and clarity of individual tokens. These are
about to be used to dynamically generate redirected URLs in deny_info
and error page texts.
I suggested %la / %lp since they seem more fuzzy on where the details
come from, without > or < claims. Seems a perfect fit for a local Squid
view of something equally fuzzy. Along the lines of how we use %un for
"any username we can find" as opposed to the specific sources.
AND they have the extra benefit of previously being used to log the
config IP:port by older Squid (under the conditions you want to make
>la do so). Reviving them with this more consistent definitive content
would technically just be a policy change on their removal. Keeping the
policy decision to _move_ origdst over to >la, leaving cases like Linux
DNAT where both have valid non-identical details.
The alternative that occurs to me is our recent use of %S_ where "S"
means Squid. Also a perfect fit by the definitions. But not as easily
backward compatible.
I believe that since the connection is intercepted, it is in the gray
area and many conflicting things will be "kind of true" about it.
If you insist on %la, and Christos is fine with that, let's add %la
that does what Christos implemented for %>la and also log a dash for
%>la when the connection is intercepted.
While the above adds more work, what is critical for me, based on user
requests, is that a single logformat option records the actual Squid
address for non-intercepted connections and the specified Squid
http_port address for intercepted connections.
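A minimal sketch of that compromise, under stated assumptions: log_la
and log_client_la are hypothetical helpers standing in for %la and
%>la, the configured address is passed as an already-resolved IP or
None, and both codes are assumed to log the socket's local address for
non-intercepted connections. None of this is taken from the Squid
sources.

```python
from typing import Optional


def log_la(intercepted: bool, configured_ip: Optional[str],
           socket_local_ip: str) -> str:
    """Proposed %la: one code that covers both kinds of connection."""
    if not intercepted:
        return socket_local_ip  # actual Squid address
    # Intercepted: the address given on the http_port line, else a dash.
    return configured_ip if configured_ip is not None else "-"


def log_client_la(intercepted: bool, socket_local_ip: str) -> str:
    """Proposed %>la: a dash whenever the connection is intercepted."""
    return "-" if intercepted else socket_local_ip
```

Under this split, a log line keyed on log_la alone is enough to tell
which configured port handled a transaction, which is the requirement
stated above.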
My understanding is that such functionality is needed in environments
where Squid handles regular and intercepted requests on multiple
http_ports and where billing and similar needs require the knowledge of
the port handling each transaction.
Seems a weird design to bill on the Squid listening port rather than
the client IP. Smells like a system that insists on "http_access allow
all" at the top of the config as well.
Amos