Re: [squid-dev] [PATCH] Native FTP relay for active FTP

2017-02-18 Thread Amos Jeffries
On 17/02/2017 12:08 p.m., Alex Rousskov wrote:
> On 02/16/2017 02:48 AM, Amos Jeffries wrote:
>> +1. The latest patch looks like correct code to me. If you are happy
>> with it too Alex please apply. Please consider using the above text in
>> the new if-else comments.
> 
> I am still OK with this patch going in, but you should be the one
> committing it if you want to change comments. It is too easy for me to
> misinterpret something while translating your email into code comments.
> 
> Alex.
> 

Applied as v5 rev.15055.

Amos

___
squid-dev mailing list
squid-dev@lists.squid-cache.org
http://lists.squid-cache.org/listinfo/squid-dev


Re: [squid-dev] [PATCH] Native FTP relay for active FTP

2017-02-16 Thread Alex Rousskov
On 02/16/2017 02:48 AM, Amos Jeffries wrote:
> +1. The latest patch looks like correct code to me. If you are happy
> with it too Alex please apply. Please consider using the above text in
> the new if-else comments.

I am still OK with this patch going in, but you should be the one
committing it if you want to change comments. It is too easy for me to
misinterpret something while translating your email into code comments.

Alex.

___
squid-dev mailing list
squid-dev@lists.squid-cache.org
http://lists.squid-cache.org/listinfo/squid-dev


Re: [squid-dev] [PATCH] Native FTP relay for active FTP

2017-02-16 Thread Amos Jeffries
On 15/02/2017 7:04 p.m., Alex wrote:
> Thank you. Actually, r12742.1.41 was a bit strange because it covered
> only one case and caused errors for other two (despite the idea was
> right).
> 
> I understood why you said that FTP relay was reported to be working:
> forward mode was doing well and proxy-aware FTP client did the trick.
> However, I tried to use the feature in interception mode. Docs said
> that it should work too, but it appeared that it's not that easy.
> 

Sorry I was not able to reply earlier. You seem to have sorted out the
cases anyhow. This is just to clarify the actual reasons behind the cases:


* T(ransparent )PROXY

The ctrl->local IP should be used (transparency). It is a remote IP
which Squid is allowed to spoof bind() via the *_TRANSPARENT flags.


* FTP explicit/forward proxy

The ctrl->local IP should be used. The client is talking directly to
Squid and it is the IP which the client knows about. The
COMM_TRANSPARENT flag is *not* set, but as we are not spoofing the
bind() should work.


==> The else-condition handles both these cases. The COMM_TRANSPARENT
being passed only if set should handle the TPROXY vs explicit-proxy
differences cleanly in comm_bind().


* NAT intercept proxy

The ctrl->local IP must not be used. It is a remote IP and Squid is not
permitted to spoof/bind() it. That leaves no choice but to rely on
OS-assigned address. However, we can/should ensure the OS selects an IP
with v4/v6 type matching the ctrl->local type - since we know the client
can speak that IP version.

==> The if(INTERCEPTION)-condition handles this case.


Hope that clarifies the things.

IMO, NAT intercept only works reliably with passive-FTP. Active might
work _if_ the client is happy to use [very] different IPs for ctrl and
data channels. Since that is something which will vary by implementation
YMMV, but this patch gives it a better chance of success than status-quo.

We might be able to figure out something more complex for multi-IP
machines involving tcp_outgoing_address or a new similar directive. But
that is out of scope for this patch IMO.

+1. The latest patch looks like correct code to me. If you are happy
with it too Alex please apply. Please consider using the above text in
the new if-else comments.

Amos
___
squid-dev mailing list
squid-dev@lists.squid-cache.org
http://lists.squid-cache.org/listinfo/squid-dev


Re: [squid-dev] [PATCH] Native FTP relay for active FTP

2017-02-14 Thread Alex
Thank you. Actually, r12742.1.41 was a bit strange because it covered only one 
case and caused errors for other two (despite the idea was right).

I understood why you said that FTP relay was reported to be working: forward 
mode was doing well and proxy-aware FTP client did the trick. However, I tried 
to use the feature in interception mode. Docs said that it should work too, but 
it appeared that it's not that easy.

15.02.2017, 01:10, "Alex Rousskov" :
> On 02/14/2017 02:39 PM, Alex wrote:
>>  Should be a bit better now.
>
> Thank you for covering all three cases and overcoming your conviction
> that one of them does not exist and that r12742.1.41 is bogus/useless.
>
> The IPv4 part still looks bad to me, but I do not think that _you_ have
> to be responsible for improving that copied code.
>
> I have no further comments about this fix and hope that somebody who
> knows more about interception than I do will look at it before committing.
>
> Thanks again,
>
> Alex.
___
squid-dev mailing list
squid-dev@lists.squid-cache.org
http://lists.squid-cache.org/listinfo/squid-dev


Re: [squid-dev] [PATCH] Native FTP relay for active FTP

2017-02-14 Thread Alex Rousskov
On 02/14/2017 02:39 PM, Alex wrote:
> Should be a bit better now.

Thank you for covering all three cases and overcoming your conviction
that one of them does not exist and that r12742.1.41 is bogus/useless.

The IPv4 part still looks bad to me, but I do not think that _you_ have
to be responsible for improving that copied code.


I have no further comments about this fix and hope that somebody who
knows more about interception than I do will look at it before committing.


Thanks again,

Alex.

___
squid-dev mailing list
squid-dev@lists.squid-cache.org
http://lists.squid-cache.org/listinfo/squid-dev


Re: [squid-dev] [PATCH] Native FTP relay for active FTP

2017-02-14 Thread Alex
Also, r12742.1.41 works in proxy mode because data connection originates from local address, this case simply does not require COMM_TRANSPARENT. The only thing the patch may break in this case is when squid box has more than one suitable address. Probably this case should be addressed.14.02.2017, 23:59, "Alex" :The problem is that r12742.1.41 is bogus and useless at the same time due to the following reasons:1. To bind some non-local IP address one needs COMM_TRANSPARENT flag which will trigger IP_TRANSPARENT socket option. The flag was not set, therefore attempts to bind any non-local address are bound to fail.2. For COMM_TRANSPARENT we need CAP_NET_ADMIN, but this capability is set only when 'ftp_port' option is set to 'tproxy' Taking this into account it may be concluded that the patch doesn't break this [tproxy] logic in any way.14.02.2017, 23:23, "Alex Rousskov" :On 02/14/2017 12:25 PM, Alex wrote: It seems that the patch doesn't make things worse (if I understood you correctly).AFAICT, you are not testing the use case that prompted [trunk] revision12742.1.41 changes. There are three possibilities:  1. r12742.1.41 changes were bogus/useless.  2. Your changes break use cases supported by r12742.1.41 changes.  3. Your changes do not break use cases supported by r12742.1.41 because all of them have a COMM_TRANSPARENT flag set.To know which outcome is correct, we need a test case that:  A. Does not work without r12742.1.41 logic.  B. Works with the current (i.e., post-r12742.1.41) unpatched code.  C. Does not have a COMM_TRANSPARENT flag set.If such a test case exists, then #2 is correct. Unfortunately, I do nothave a ready-to-use instructions on how to construct such a use case,but it probably has to involve:* direct/non-intercepted control connection (hence, no COMM_TRANSPARENTflag if you are testing patched Squid code);* multiple IP addresses from which Squid can open FTP data connectionsto the FTP client (i.e., the FTP client IP has to be reachable from theSquid box from two different source IP addresses);* TCP stack selecting the wrong default source IP address whenestablishing a connection from Squid to the FTP client.When all of the above bullets are satisfied, r12742.1.41 starts tomatter in the patched code, meeting conditions A-C above. AFAICT, yourcurrent test cases do not satisfy the last two bullets.In other words, you patch removes setting the source IP address of theSquid data connection to the destination address of the controlconnection when there is no COMM_TRANSPARENT flag. Squid r12742.1.41claims that such setting is necessary for some use cases to work. It isnot clear whether r12742.1.41 is only relevant to COMM_TRANSPARENTconnections. If it is, then your patch does not break things. If it isnot, then your patch probably does.Hope this clarifies,Alex.,___squid-dev mailing listsquid-dev@lists.squid-cache.orghttp://lists.squid-cache.org/listinfo/squid-dev___
squid-dev mailing list
squid-dev@lists.squid-cache.org
http://lists.squid-cache.org/listinfo/squid-dev


Re: [squid-dev] [PATCH] Native FTP relay for active FTP

2017-02-14 Thread Alex
The problem is that r12742.1.41 is bogus and useless at the same time due to the following reasons:1. To bind some non-local IP address one needs COMM_TRANSPARENT flag which will trigger IP_TRANSPARENT socket option. The flag was not set, therefore attempts to bind any non-local address are bound to fail.2. For COMM_TRANSPARENT we need CAP_NET_ADMIN, but this capability is set only when 'ftp_port' option is set to 'tproxy' Taking this into account it may be concluded that the patch doesn't break this [tproxy] logic in any way.14.02.2017, 23:23, "Alex Rousskov" :On 02/14/2017 12:25 PM, Alex wrote: It seems that the patch doesn't make things worse (if I understood you correctly).AFAICT, you are not testing the use case that prompted [trunk] revision12742.1.41 changes. There are three possibilities:  1. r12742.1.41 changes were bogus/useless.  2. Your changes break use cases supported by r12742.1.41 changes.  3. Your changes do not break use cases supported by r12742.1.41 because all of them have a COMM_TRANSPARENT flag set.To know which outcome is correct, we need a test case that:  A. Does not work without r12742.1.41 logic.  B. Works with the current (i.e., post-r12742.1.41) unpatched code.  C. Does not have a COMM_TRANSPARENT flag set.If such a test case exists, then #2 is correct. Unfortunately, I do nothave a ready-to-use instructions on how to construct such a use case,but it probably has to involve:* direct/non-intercepted control connection (hence, no COMM_TRANSPARENTflag if you are testing patched Squid code);* multiple IP addresses from which Squid can open FTP data connectionsto the FTP client (i.e., the FTP client IP has to be reachable from theSquid box from two different source IP addresses);* TCP stack selecting the wrong default source IP address whenestablishing a connection from Squid to the FTP client.When all of the above bullets are satisfied, r12742.1.41 starts tomatter in the patched code, meeting conditions A-C above. AFAICT, yourcurrent test cases do not satisfy the last two bullets.In other words, you patch removes setting the source IP address of theSquid data connection to the destination address of the controlconnection when there is no COMM_TRANSPARENT flag. Squid r12742.1.41claims that such setting is necessary for some use cases to work. It isnot clear whether r12742.1.41 is only relevant to COMM_TRANSPARENTconnections. If it is, then your patch does not break things. If it isnot, then your patch probably does.Hope this clarifies,Alex.___
squid-dev mailing list
squid-dev@lists.squid-cache.org
http://lists.squid-cache.org/listinfo/squid-dev


Re: [squid-dev] [PATCH] Native FTP relay for active FTP

2017-02-14 Thread Alex Rousskov
On 02/14/2017 12:25 PM, Alex wrote:
> It seems that the patch doesn't make things worse (if I understood you
> correctly).

AFAICT, you are not testing the use case that prompted [trunk] revision
12742.1.41 changes. There are three possibilities:

  1. r12742.1.41 changes were bogus/useless.
  2. Your changes break use cases supported by r12742.1.41 changes.
  3. Your changes do not break use cases supported by r12742.1.41
 because all of them have a COMM_TRANSPARENT flag set.

To know which outcome is correct, we need a test case that:

  A. Does not work without r12742.1.41 logic.
  B. Works with the current (i.e., post-r12742.1.41) unpatched code.
  C. Does not have a COMM_TRANSPARENT flag set.

If such a test case exists, then #2 is correct. Unfortunately, I do not
have a ready-to-use instructions on how to construct such a use case,
but it probably has to involve:

* direct/non-intercepted control connection (hence, no COMM_TRANSPARENT
flag if you are testing patched Squid code);

* multiple IP addresses from which Squid can open FTP data connections
to the FTP client (i.e., the FTP client IP has to be reachable from the
Squid box from two different source IP addresses);

* TCP stack selecting the wrong default source IP address when
establishing a connection from Squid to the FTP client.

When all of the above bullets are satisfied, r12742.1.41 starts to
matter in the patched code, meeting conditions A-C above. AFAICT, your
current test cases do not satisfy the last two bullets.


In other words, you patch removes setting the source IP address of the
Squid data connection to the destination address of the control
connection when there is no COMM_TRANSPARENT flag. Squid r12742.1.41
claims that such setting is necessary for some use cases to work. It is
not clear whether r12742.1.41 is only relevant to COMM_TRANSPARENT
connections. If it is, then your patch does not break things. If it is
not, then your patch probably does.


Hope this clarifies,

Alex.

___
squid-dev mailing list
squid-dev@lists.squid-cache.org
http://lists.squid-cache.org/listinfo/squid-dev


Re: [squid-dev] [PATCH] Native FTP relay for active FTP

2017-02-14 Thread Alex
  14.02.2017, 21:50, "Alex Rousskov" :When I am talking about the forwarding case, I am talking about a simpleconfiguration without any redirection or other TCP/IP level tricks oreven reverse proxing of any sort: The FTP client sends FTP requestsdirectly to Squid's ftp_port (specifying the ultimate destination usingan extra @ sign). IIRC, that used to work with popular proxy-aware FTPclients when we added the ftp_port code.In that simple case, some FTP client refuse to accept the FTP dataconnection from Squid unless that data connection comes from the samesource IP address as the destination address of the corresponding FTPcontrol connection. The old "Use local IP..." code/comment were added tomake such clients happy (the changes in [trunk] revision 12742.1.41 arepretty telling, especially in the light of your changes). Just tried the following setup with squid running on localhost:21: [alg@centos64x64 ~]$ ftp 127.0.0.1Connected to 127.0.0.1 (127.0.0.1).220 Service readyName (127.0.0.1:alg): anonymous@172.17.10.30331 Anonymous access allowed, send identity (e-mail name) as password.Password:230 User logged in.Remote system type is Windows_NT.ftp> passivePassive mode off.ftp> ls200 PORT successfully converted to PASV.125 Data connection already open; Transfer starting I have also tried to connect from my Windows machine to itself via squid on a different machine with FileZilla client (proxy server was specified in settings) - works like a charm. It seems that the patch doesn't make things worse (if I understood you correctly).___
squid-dev mailing list
squid-dev@lists.squid-cache.org
http://lists.squid-cache.org/listinfo/squid-dev


Re: [squid-dev] [PATCH] Native FTP relay for active FTP

2017-02-14 Thread Alex
Added some clarifications.committer: Alexander Gozman 
timestamp: Tue 2017-02-14 21:46:28 +0300
message:
  Native FTP relay: NAT and TPROXY interception fixes.
diff:
=== modified file 'src/servers/FtpServer.cc'
--- src/servers/FtpServer.cc	2017-01-01 00:14:42 +
+++ src/servers/FtpServer.cc	2017-02-14 18:46:28 +
@@ -1455,7 +1455,22 @@
 
 // Use local IP address of the control connection as the source address
 // of the active data connection, or some clients will refuse to accept.
-conn->setAddrs(clientConnection->local, cltAddr);
+if (clientConnection->flags & COMM_TRANSPARENT) {
+conn->setAddrs(clientConnection->local, cltAddr);
+conn->flags |= COMM_TRANSPARENT;
+} else {
+// In case of NAT interception conn->local value is not set
+// because the TCP stack will automatically pick correct source
+// address for the data connection. We must only ensure that IP
+// version matches client's address.
+conn->local.setAnyAddr();
+
+if (cltAddr.isIPv4())
+conn->local.setIPv4();
+
+conn->remote = cltAddr;
+}
+
 // RFC 959 requires active FTP connections to originate from port 20
 // but that would preclude us from supporting concurrent transfers! (XXX?)
 conn->local.port(0);___
squid-dev mailing list
squid-dev@lists.squid-cache.org
http://lists.squid-cache.org/listinfo/squid-dev


Re: [squid-dev] [PATCH] Native FTP relay for active FTP

2017-02-14 Thread Alex Rousskov
On 02/14/2017 10:38 AM, Alex wrote:
>>>  + if (clientConnection->flags & COMM_TRANSPARENT) {
>>>  + conn->setAddrs(clientConnection->local, cltAddr);
>>>  + conn->flags |= COMM_TRANSPARENT;
>>>  + } else {
>>>  + // In case of NAT interception ...

>> Are there really just two cases here (tproxy and NAT)? IIRC, a "forward
>> FTP proxy" mode without any TCP/IP level redirection and address
>> rewriting tricks used to work fine, but your email and the new code may
>> be interpreted to imply otherwise. If there are indeed three supported
>> cases, then the new if-statement condition may need to be adjusted (at
>> least).

> I suppose that two cases should be enough. AFAIR, forward/reverse
> configuration is handled by firewall redirection rules.

When I am talking about the forwarding case, I am talking about a simple
configuration without any redirection or other TCP/IP level tricks or
even reverse proxing of any sort: The FTP client sends FTP requests
directly to Squid's ftp_port (specifying the ultimate destination using
an extra @ sign). IIRC, that used to work with popular proxy-aware FTP
clients when we added the ftp_port code.

In that simple case, some FTP client refuse to accept the FTP data
connection from Squid unless that data connection comes from the same
source IP address as the destination address of the corresponding FTP
control connection. The old "Use local IP..." code/comment were added to
make such clients happy (the changes in [trunk] revision 12742.1.41 are
pretty telling, especially in the light of your changes).

If my recollection is correct, then we should continue to support that
simple case. Your patch breaks that support and, AFAICT, your "Squid is
a gateway" test does not cover that scenario.


>>>  + if (conn->remote.isIPv4())
>>>  + conn->local.setIPv4();
>>
>> I know that Squid uses the same code elsewhere, and I assume this
>> "works" today, but it looks misleading to me. Do we want the local
>> address to have the same IP version as the remote address has? If yes,
>> the above code does not say that and, ideally, should be adjusted.
>> Again, I am not saying that this code does not work.

> Ok, I'll add corresponding notes there.

I was thinking about using an Ip::Address method that sets "this" IP
version to the version of the supplied IP address. You can add such a
method if we do not have it already.


Thank you,

Alex.

___
squid-dev mailing list
squid-dev@lists.squid-cache.org
http://lists.squid-cache.org/listinfo/squid-dev


Re: [squid-dev] [PATCH] Native FTP relay for active FTP

2017-02-14 Thread Alex


14.02.2017, 19:06, "Alex Rousskov" :

>>  + if (clientConnection->flags & COMM_TRANSPARENT) {
>>  + conn->setAddrs(clientConnection->local, cltAddr);
>>  + conn->flags |= COMM_TRANSPARENT;
>>  + } else {
>>  + // In case of NAT interception ...
>
> Are there really just two cases here (tproxy and NAT)? IIRC, a "forward
> FTP proxy" mode without any TCP/IP level redirection and address
> rewriting tricks used to work fine, but your email and the new code may
> be interpreted to imply otherwise. If there are indeed three supported
> cases, then the new if-statement condition may need to be adjusted (at
> least).

I suppose that two cases should be enough. AFAIR, forward/reverse configuration 
is handled by firewall redirection rules.
I have tested the patch in configuration like this, keeping forward proxy in 
mind:

[ FTP Client, 1.1.1.1] <---> [ GW with Squid ] <--->  [ FTP Server, 
5.5.5.5]


>
>>  + // In case of NAT interception squid's local address
>>  + // will be used for outgoing connection.
>>  + conn->local.setAnyAddr();
>>  + conn->remote = cltAddr;
>
> Finally, it is not clear to me whether the new comment means something
> like this:
>
> * If we set conn->local to any IP address (with the right version), then
> the TCP stack will pick the correct source address for the data
> connection because we are using NAT.
>
> or something like this:
>
> * The exact conn->local value does not matter because the TCP stack will
> automatically pick the correct source address for the data connection
> when we are using NAT. Just make sure the IP version is correct.

Yes, this is the case. I will adjust the comment to make it more clear.

>>  + if (conn->remote.isIPv4())
>>  + conn->local.setIPv4();
>
> I know that Squid uses the same code elsewhere, and I assume this
> "works" today, but it looks misleading to me. Do we want the local
> address to have the same IP version as the remote address has? If yes,
> the above code does not say that and, ideally, should be adjusted.
> Again, I am not saying that this code does not work.

Ok, I'll add corresponding notes there.
___
squid-dev mailing list
squid-dev@lists.squid-cache.org
http://lists.squid-cache.org/listinfo/squid-dev


Re: [squid-dev] [PATCH] Native FTP relay for active FTP

2017-02-14 Thread Alex Rousskov
On 02/14/2017 03:44 AM, Alex wrote:

> The attached patch allows FTP relay to work in NAT interception mode
> and also fixes IP address binding in TPROXY mode.

Thank you for working on this bug. NAT/TPROXY address manipulation is
not my area of expertise, but I have one higher level concern and a few
less important clarification requests:


> +if (clientConnection->flags & COMM_TRANSPARENT) {
> +conn->setAddrs(clientConnection->local, cltAddr);
> +conn->flags |= COMM_TRANSPARENT;
> +} else {
> +// In case of NAT interception ...

Are there really just two cases here (tproxy and NAT)? IIRC, a "forward
FTP proxy" mode without any TCP/IP level redirection and address
rewriting tricks used to work fine, but your email and the new code may
be interpreted to imply otherwise. If there are indeed three supported
cases, then the new if-statement condition may need to be adjusted (at
least).

> +// In case of NAT interception squid's local address
> +// will be used for outgoing connection.
> +conn->local.setAnyAddr();
> +conn->remote = cltAddr;

In this context, the phrase "squid's local address" can mean a lot of
different things. Is there a more precise term we can use here? For
_example_, the existing comment uses "local IP address of the control
connection" which is a lot more specific.

Also, does the comment talk about the source or the destination address
for the "outgoing connection"? For example, the existing comment says
"as the source address of the active data connection" which is a lot
more specific.

If "outgoing connection" means "data connection", then I recommend using
the latter terminology because it is more precise (there are several
connections in this context that somebody might consider "outgoing",
depending on how they view their proxy role or network "position").


Finally, it is not clear to me whether the new comment means something
like this:

* If we set conn->local to any IP address (with the right version), then
the TCP stack will pick the correct source address for the data
connection because we are using NAT.

or something like this:

* The exact conn->local value does not matter because the TCP stack will
automatically pick the correct source address for the data connection
when we are using NAT. Just make sure the IP version is correct.

Or something else entirely (e.g., perhaps it is not the TCP stack but
some other Squid code that selects the right source address). The
passive voice in the new comment may be obscuring the intended meaning.



> +if (conn->remote.isIPv4())
> +conn->local.setIPv4();


I know that Squid uses the same code elsewhere, and I assume this
"works" today, but it looks misleading to me. Do we want the local
address to have the same IP version as the remote address has? If yes,
the above code does not say that and, ideally, should be adjusted.
Again, I am not saying that this code does not work.


Thank you,

Alex.

___
squid-dev mailing list
squid-dev@lists.squid-cache.org
http://lists.squid-cache.org/listinfo/squid-dev


[squid-dev] [PATCH] Native FTP relay for active FTP

2017-02-14 Thread Alex
Hello.

Recently I've noticed some issues in native FTP relay that made it unusable 
with active FTP mode (data connection was not established): 
http://lists.squid-cache.org/pipermail/squid-users/2017-February/014404.html

The attached patch allows FTP relay to work in NAT interception mode and also 
fixes IP address binding in TPROXY mode. Unfortunately, I have no idea on how 
to test TPROXY, but this patch should definitely make things better.committer: Alexander Gozman 
timestamp: Tue 2017-02-14 13:29:52 +0300
message:
  Native FTP relay: NAT and TPROXY interception fixes.
diff:
=== modified file 'src/servers/FtpServer.cc'
--- src/servers/FtpServer.cc	2017-01-01 00:14:42 +
+++ src/servers/FtpServer.cc	2017-02-14 10:29:52 +
@@ -1455,7 +1455,19 @@
 
 // Use local IP address of the control connection as the source address
 // of the active data connection, or some clients will refuse to accept.
-conn->setAddrs(clientConnection->local, cltAddr);
+if (clientConnection->flags & COMM_TRANSPARENT) {
+conn->setAddrs(clientConnection->local, cltAddr);
+conn->flags |= COMM_TRANSPARENT;
+} else {
+// In case of NAT interception squid's local address
+// will be used for outgoing connection.
+conn->local.setAnyAddr();
+conn->remote = cltAddr;
+
+if (conn->remote.isIPv4())
+conn->local.setIPv4();
+}
+
 // RFC 959 requires active FTP connections to originate from port 20
 // but that would preclude us from supporting concurrent transfers! (XXX?)
 conn->local.port(0);___
squid-dev mailing list
squid-dev@lists.squid-cache.org
http://lists.squid-cache.org/listinfo/squid-dev


Re: [PATCH] Native FTP Relay

2014-08-11 Thread Amos Jeffries
On 11/08/2014 11:30 a.m., Alex Rousskov wrote:
 On 08/10/2014 06:11 AM, Amos Jeffries wrote:
snip
 * please use tok.prefix(CharacterSet::ALPHA) to parse the FTP command
 instead of BlackSpace. Then explicit delimiter check to validate the input.
  - RFC 959 is clear:
   The command codes themselves are alphabetic characters terminated
by the character SP (Space) if parameters follow and Telnet-EOL
otherwise.
 
 Not changed. I do not see a compelling reason for a general purpose
 relay to police traffic by default. Squid itself does not care if there
 is some other character present in some command name. Why increase the
 chances of breaking what was working without Squid by rejecting commands
 by default?
 
 Yes, it is possible that some command that Squid does care about would
 have invalid trailing characters in it, preventing Squid from
 recognizing it, but, with your approach, Squid would have to reject that
 command anyway, so we would not break something that would work in a
 policing Squid. In the worst case, we would just make triage more difficult.
 
 If you insist on Squid rejecting RFC-invalid command names, I will
 change the code, but I suggest leaving it permissive for now.
 

I disagree be should be quite so tolerant in the current Internet
without an explicit reason. But that is not a blocker on this patch, we
can fix it later.

RFC 959 is quite explicit. Section 5.3 documents what consists a valid
command (alphabet characters only, case insensitive, 4 or fewer).
Also, the basic FTP commands are a fairly well-known set. Everything
else must be listed in FEAT - which we can filter for offerings of
commands not fitting the supported syntax.



+1. Branch looks good enough for merge now. Please do, so I can re-work
the other concurrent submissions on the new
Tokenizer/CharacterSet/ConnStateData APIs.

Amos



Re: [PATCH] Native FTP Relay

2014-08-11 Thread Alex Rousskov
On 08/11/2014 05:50 AM, Amos Jeffries wrote:
 On 11/08/2014 11:30 a.m., Alex Rousskov wrote:
 On 08/10/2014 06:11 AM, Amos Jeffries wrote:
 snip
 * please use tok.prefix(CharacterSet::ALPHA) to parse the FTP command
 instead of BlackSpace. Then explicit delimiter check to validate the input.
  - RFC 959 is clear:
   The command codes themselves are alphabetic characters terminated
by the character SP (Space) if parameters follow and Telnet-EOL
otherwise.

 Not changed. I do not see a compelling reason for a general purpose
 relay to police traffic by default. Squid itself does not care if there
 is some other character present in some command name. Why increase the
 chances of breaking what was working without Squid by rejecting commands
 by default?

 Yes, it is possible that some command that Squid does care about would
 have invalid trailing characters in it, preventing Squid from
 recognizing it, but, with your approach, Squid would have to reject that
 command anyway, so we would not break something that would work in a
 policing Squid. In the worst case, we would just make triage more difficult.

 If you insist on Squid rejecting RFC-invalid command names, I will
 change the code, but I suggest leaving it permissive for now.

 
 I disagree be should be quite so tolerant in the current Internet
 without an explicit reason. But that is not a blocker on this patch, we
 can fix it later.
 
 RFC 959 is quite explicit. Section 5.3 documents what consists a valid
 command (alphabet characters only, case insensitive, 4 or fewer).
 Also, the basic FTP commands are a fairly well-known set. Everything
 else must be listed in FEAT - which we can filter for offerings of
 commands not fitting the supported syntax.

I view this from interoperability point of view: From that point of
view, by default, traffic on the wire is more important than an RFC
text, especially when dealing with an old protocol (with weird
implementations that cannot be changed) and an old RFC (with vague
requirements that can be easily misinterpreted or interpreted
differently). In that context, Do no harm may overrule Obey RFC.

I say this with no disrespect to IETF and without any implication that
RFCs themselves are unimportant. Needless to say, it is often developer
inability to follow an RFC that causes interoperability problems in the
first place (and may force us, proxy developers, to accommodate buggy
implementations).

Furthermore, I understand that being tolerant leads to problems on its
own. If we learn that we have to police FTP command names for some
specific good reason, I would not hesitate supporting the corresponding
changes.


 +1. Branch looks good enough for merge now.

Merged as trunk r13528. I also updated release notes in hope to minimize
your release work, but I cannot test their rendering, and you probably
still want to audit them for style (at least).

I plan to fix Tokenizer::prefix() return value for empty prefixes next.


Thank you,

Alex.



Re: [PATCH] Native FTP Relay

2014-08-10 Thread Amos Jeffries
On 10/08/2014 2:44 p.m., Alex Rousskov wrote:
 On 08/08/2014 11:13 AM, Amos Jeffries wrote:
 On 9/08/2014 4:57 a.m., Alex Rousskov wrote:
 On 08/08/2014 09:48 AM, Amos Jeffries wrote:
 Audit results (part 1):
 * Please apply CharacterSet updates separately.
 * Please apply Tokenizer API updates separately.
 * please apply src/HttpHdrCc.h changes separately.
 
 Why? If the branch is merged, they will be applied with their own/
 existing commit messages.
 
 They are updates on previous feature changesets unrelated to FTP which
 will block future parser alterations (also unrelated to FTP) being
 backported if this project takes long to stabilize.

 If they have their own commit messages then cherrypicking out of the
 branch should be relatively easy.
 
 Thank you for explaining your rationale. To minimize overheads, let's
 come back to this issue if the branch is not merged soon.
 
 
 * is FTP/1.1 really the protocol version of FTP parsed by Squid? (the
 AnyP::ProtocolVersion parameters)
 
 Okay, leave as-is but please document that as the reason for using that
 version number
 
 Done. It was actually documented in the corresponding commit message. To
 provide _source code_ documentation in one place, I added
 Ftp::ProtocolVersion() and used that everywhere instead of hard-coded
 1,1. It will be easier to track down all 1,1 users that way if
 somebody wants to change or polish this further later. See ftp/Elements.*

Thank you.
NP: I found two related spots that need converting as well. Highlighted
below in this round of audit notes.

 
 The resulting code required more changes than one may expect and is
 still a little awkward because Http::ProtocolVersion should have been
 just a function (not a class/type), and because PortCfg::setTransport
 internals did not belong to AnyP where they were placed. I did my best
 to avoid cascading changes required to fix all that by leaving
 Http::ProtocolVersion as is and moving PortCfg::setTransport() internals
 into cache_cf.cc where they can access FTP-specific code.
 
 
 and a TODO about cleaning up the message processing to
 stop assuming HTTP versions.
 
 Done in one of the many client_side.cc places where we still assume
 HTTP. Here is the new TODO:
 
 /* RFC 2616 section 10.5.6 : handle unsupported HTTP major versions 
 cleanly. */
 /* We currently only support 0.9, 1.0, 1.1 properly */
 /* TODO: move HTTP-specific processing into servers/HttpServer and such 
 */
 if ( (http_ver.major == 0  http_ver.minor != 9) ||
 (http_ver.major  1) ) {
 
 
 This part of the problem will be gone when HTTP-specific code will be
 moved to HttpServer, but the ICAP compatibility part will not go away
 regardless of any cleanup.

The ICAP compatibility issue is a bug in ICAP or specific ICAP servers
that needs fixing there. HTTP itself can already (or shortly will) have
present HTTP/2.0, h2 or h2-NN message versions to be encapsulated.


 
 in src/Server.cc:
 * this change looks semantically wrong:
-if (!doneWithServer()) {
+if (mayReadVirginReplyBody()) {
  closeServer();
 
 The change itself does not make the code look semantically wrong. On the
 semantics level, the code looks about the same before and after the change:
 
 Was: If we are not done with the server, close it.
 Now: If we are still reading from the server, close it.

It is now *apparently* skipping all the possible non-reply-reading
reasons for closing the server connection. That is semantic change...

 
 in src/Server.h:
 * I'm not sure the call order of doneWithServer() and
 mayReadVirginReplyBody() is the right way around. 
 
 Sorry, I am not sure what you mean by call order in src/Server.h
 context. That file does not determine the order of those two calls.

... but mayReadVirginReplyBody() is calling doneWithServer() in the HTTP
case. You clarified why its done below and I am certain now that it is
indeed wrong, although necessary as a temporary workaround until the
non-reply-reading cases are checked and fixed.

 
 It seems like being
 done completely with a server includes whether more reply is coming
 as one sub-condition of several (see above), not the reverse.
 
 Correct. As the commit message quoted above tries to explain, FTP needed
 to distinguish two cases that the old code merged together:
 
 * The transaction may still be receiving the virgin response (the new
 mayReadVirginReplyBody method). This is when FTP and HTTP code has to
 end communication with the server. No other solution here right now (as
 the source code comment explains).
 
 * The transaction stopped receiving the virgin response but may still be
 able to communicate with the origin server. In this case, the FTP code
 wants to keep the [pinned] _control_ connection to the origin server
 open and, hence, does not want to call closeServer().
 
 The code remains the same for HTTP because the new
 mayReadVirginReplyBody() method just calls the old doneWithServer()
 method in HTTP case.
 


Re: [PATCH] Native FTP Relay

2014-08-10 Thread Alex Rousskov
On 08/10/2014 06:11 AM, Amos Jeffries wrote:
 On 10/08/2014 2:44 p.m., Alex Rousskov wrote:
 This part of the problem will be gone when HTTP-specific code will be
 moved to HttpServer, but the ICAP compatibility part will not go away
 regardless of any cleanup.
 
 The ICAP compatibility issue is a bug in ICAP or specific ICAP servers
 that needs fixing there.

Yes, of course, but fixing those ICAP services is often beyond Squid
admin control, and there is no pressing need to break compatibility with
those ICAP services _now_ (by using a different FTP version than 1.0
or 1.1).


 HTTP itself can already (or shortly will) have
 present HTTP/2.0, h2 or h2-NN message versions to be encapsulated.

I suspect we will receive requests to optionally mask those new versions
when sent over ICAP because some broken service that the admin cannot
fix cannot handle much more than /1.0 and /1.1.


 in src/Server.cc:
 * this change looks semantically wrong:
-if (!doneWithServer()) {
+if (mayReadVirginReplyBody()) {
  closeServer();

 The change itself does not make the code look semantically wrong. On the
 semantics level, the code looks about the same before and after the change:

 Was: If we are not done with the server, close it.
 Now: If we are still reading from the server, close it.
 
 It is now *apparently* skipping all the possible non-reply-reading
 reasons for closing the server connection. That is semantic change...

If you think the above change affects something on the semantics level,
then I do not have any new arguments to convince you otherwise.
Hopefully, the diverging opinions on the code looks are not important in
this case, because we seem to agree that the patch code is correct (at
least in a sense that it does not make the known problem any worse and
actually solves one sub-problem).


 Okay, I think I understand the cases now. So the old HTTP code was
 broken by not considering the request and Trailers cases as okay even
 when ICAP closed. You are just not fixing that bug.

Yes, I am not fixing any old bugs here. As for what exactly is broken
and by how much, I prefer not to discuss those details further (here and
now) because the patch does not change them.


 Would you mind adding a TODO to the new HTTP mayReadVirginReplyBody()
 that it should be only considering reply body reads, not overall server use?

Done. There was no HTTP mayReadVirginReplyBody(), but I added one.


 in src/cache_cf.cc:
 * the new parsePortProtocol() FTP case should return
 Ftp::ProtocolVersion().

Done.


 in src/cf.data.pre:
 * mentions of ftp-track-dirs=on|off need updating

Fixed.


 in src/ftp/Elements.cc:
 * Ftp::HttpReplyWrapper() has unnecessary:
   httpVersion = Http::ProtocolVersion(Ftp::ProtocolVersion().major,
 Ftp::ProtocolVersion().minor);
 
  - the current default Http::ProtocolVersion() constructor sets the
 right values for HTTP messages generated by Squid. The old code using
 Http::ProtocolVersion(1, 1) was incorrect/outdated.
 
  - same for Ftp::Server::parseOneRequest() in src/servers/FtpServer.cc

Left as is because the Http::ProtocolVersion default is irrelevant in
those two places.

We should set the wrapping message version to what FTP code wants to use
for HTTP wrapping. That version is defined by Ftp::ProtocolVersion().
The fact that the HTTP default currently matches the version used for
FTP wrapping is a coincidence that the code should not rely on. Imagine
changing the HTTP default to 2.0 tomorrow and then spending hours trying
to find all the places where FTP was _implicitly_ relying on that
default being 1.1...


 in src/clients/Makefile.am:
 * no need for forward.h in the SOURCES list to be separated from the
 other files.
  - same in src/servers/Makefile.am

Removed.


 * please include $(top_srcdir)/src/TestHeaders.am to run the .h header
 unit tests
  - same in src/ftp/Makefile.am and src/servers/Makefile.am

Done.


 in src/servers/forward.h:
 * missing pre-define for class ConnStateData
  - this would have been caught by the .h unit tests mentioned above.

Fixed.


 in src/servers/FtpServer.cc:
 
 + Ftp::Server::parseOneRequest() for all the following until ++
 
 * please reference RFC 959 section 3.1.1.5.2 for the unusual definition
 of WhiteSpace CharacterSet.

Done. It was actually based on isspace(3) so not that unusual :-)


 * please use existing CharacterSet::LF instead of re-defining as local
 NewLine.

Done.


 * please use tok.prefix(CharacterSet::ALPHA) to parse the FTP command
 instead of BlackSpace. Then explicit delimiter check to validate the input.
  - RFC 959 is clear:
   The command codes themselves are alphabetic characters terminated
by the character SP (Space) if parameters follow and Telnet-EOL
otherwise.

Not changed. I do not see a compelling reason for a general purpose
relay to police traffic by default. Squid itself does not care if there
is some other character present in some command name. Why increase the
chances of breaking 

Re: [PATCH] Native FTP Relay

2014-08-09 Thread Alex Rousskov
On 08/08/2014 11:13 AM, Amos Jeffries wrote:
 On 9/08/2014 4:57 a.m., Alex Rousskov wrote:
 On 08/08/2014 09:48 AM, Amos Jeffries wrote:
 Audit results (part 1):
 * Please apply CharacterSet updates separately.
 * Please apply Tokenizer API updates separately.
 * please apply src/HttpHdrCc.h changes separately.

 Why? If the branch is merged, they will be applied with their own/
 existing commit messages.

 They are updates on previous feature changesets unrelated to FTP which
 will block future parser alterations (also unrelated to FTP) being
 backported if this project takes long to stabilize.
 
 If they have their own commit messages then cherrypicking out of the
 branch should be relatively easy.

Thank you for explaining your rationale. To minimize overheads, let's
come back to this issue if the branch is not merged soon.


 * is FTP/1.1 really the protocol version of FTP parsed by Squid? (the
 AnyP::ProtocolVersion parameters)

 Okay, leave as-is but please document that as the reason for using that
 version number

Done. It was actually documented in the corresponding commit message. To
provide _source code_ documentation in one place, I added
Ftp::ProtocolVersion() and used that everywhere instead of hard-coded
1,1. It will be easier to track down all 1,1 users that way if
somebody wants to change or polish this further later. See ftp/Elements.*

The resulting code required more changes than one may expect and is
still a little awkward because Http::ProtocolVersion should have been
just a function (not a class/type), and because PortCfg::setTransport
internals did not belong to AnyP where they were placed. I did my best
to avoid cascading changes required to fix all that by leaving
Http::ProtocolVersion as is and moving PortCfg::setTransport() internals
into cache_cf.cc where they can access FTP-specific code.


 and a TODO about cleaning up the message processing to
 stop assuming HTTP versions.

Done in one of the many client_side.cc places where we still assume
HTTP. Here is the new TODO:

 /* RFC 2616 section 10.5.6 : handle unsupported HTTP major versions 
 cleanly. */
 /* We currently only support 0.9, 1.0, 1.1 properly */
 /* TODO: move HTTP-specific processing into servers/HttpServer and such */
 if ( (http_ver.major == 0  http_ver.minor != 9) ||
 (http_ver.major  1) ) {


This part of the problem will be gone when HTTP-specific code will be
moved to HttpServer, but the ICAP compatibility part will not go away
regardless of any cleanup.


 in src/cf.data.pre:
 * please document tproxy, name=, protocol=FTP as supported on ftp_port

 We do not know whether it is supported and do not plan on testing/fixing
 that support right now. Do you still want us to document it as supported?

 Yes. You have not changed the TcpAcceptor code for TPROXY lookups or the
 ACL code using port name. So they are supported.

Done. Please note that protocol= requires accel mode, and I am not sure
what that mode means for a native FTP relay. I added protocol=FTP anyway
in hope to shorten the discussion.


 protocol= you explicitly added support in the parser.

This is probably a misunderstanding caused by poor official code
quality. We did not add explicit support for the protocol= option. The
parsing code we added was required for the _default_ ftp_port
configuration (no protocol=... option) to work because of the way the
general port parsing (and other) official code is written.

The official code uses CfgPort::transport for at least two rather
different purposes (for determining foo in foo_port and for the
protocol= feature). Cleaning this up requires a serious effort well
beyond the FTP project scope (the FTP code does not really need the
protocol= feature to work).

If, after reading the above clarification, you would rather see
protocol=FTP gone from squid.conf.documented, I would be more than happy
to remove it.


 in src/FwdState.cc:
 * at line 817 calling request-hier.note() twice in a row with different
 values is both needless change and wasting CPU.

Fixed? This was my merge bug. The fixed patch uses the more recent line
version (and both lines were yours, thank you).


 in src/HttpHeader.cc:
 * please enact the new TODO about stripping FTP-Arguments header on
 PASS command. 

Done to shorten the discussion.


 This information leak could earn us a CVE if merged as-is.

I do not think it could because that code is currently unused by/for FTP
relays (as the added source code comment tried to explain). Native FTP
relay does not send Squid error pages to the user and those error
pages are the only context (that I know of) where the masking code is
actually used today.


 in src/HttpHeaderTools.cc:
 * httpHeaderQuoteString() could be optimized with needInnerQuote as
 SBuf::size_type to append in blocks between characters needing quote.

Agreed. Added a corresponding TODO. I do not want to volunteer to
provide the corresponding optimization at this time and consider it
rather 

Re: [PATCH] Native FTP Relay

2014-08-08 Thread Amos Jeffries
On 8/08/2014 3:29 p.m., Alex Rousskov wrote:
 Hello,
 
 Native FTP Relay support is finally ready: Squid receives native FTP
 commands on ftp_port and relays FTP messages between FTP clients and FTP
 origin servers through the existing Squid access control, adaptation,
 and logging layers. A compressed 70KB patch attached to this email is
 also temporary available at [1]. I would like to merge the corresponding
 lp branch[2] into Squid trunk.
 
 This is a large, complex, experimental feature. I am sure there are
 cases where it does not work well or does not support some existing
 features. I am not aware of any regressions specific to old FTP gateway
 code, and I hope there are no regressions specific to HTTP code, but I
 do not recommend deployment without testing.
 
 The branch code worked quite well in limited production deployment, but
 the final version (with code restructuring and polishing for the
 official submission) has only seen simple lab testing. AFAIK, the FTP
 interception path has not seen any production deployment at all.
 
 This code represents more than a year worth of development, started by
 Dmitry Kurochkin in the beginning of 2013. Christos Tsantilas and I
 finished Dmitry's work. I bear responsibility for the bugs, but there
 are probably areas of Dmitry's code that appear to work well and that
 neither Christos nor I have studied.
 
 Overall, we tried to focus on the new FTP code and leave the old
 FTP/HTTP code alone, except where restructuring was necessary to avoid
 code duplication. For example, the server-facing side reuses a lot of
 the old FTP code, while the relayed FTP commands are passed around in
 HttpMsg structures using the old ClientStreams, doCallout(), and Store
 interfaces.
 
 If you review the patch, please note that bzr is incapable of tracking
 code across file splits (e.g., old ftp.cc becoming ftp/Parsing.cc plus
 three clients/Ftp*.cc files). Many of the old XXXs and TODOs will appear
 as newly added code in the patch. Usually, one can figure it out by
 searching for the similar but deleted code in the patch.
 
 Please see the above referenced lp branch for a detailed change log.
 
 
 Some Native FTP Relay highlights:
 
 * Added ftp_port directive telling Squid to relay native FTP commands.
 * Active and passive FTP support on the user-facing side; require
   passive connections to come from the control connection src IP.
 * IPv6 support (EPSV and, on the user-facing side, EPRT).
 * Intelligent adaptation of relayed FTP FEAT responses.
 * Relaying of multi-line FTP control responses using various formats.
 * Support relaying of FTP MLSD and MLST commands (RFC 3659).
 * Several Microsoft FTP server compatibility features.
 * ICAP/eCAP support (at individual FTP command/response level).
 * Optional current FTP directory tracking (cannot be 100% reliable due
   to symbolic links and such, but is helpful in some common use cases).
 * FTP origin control connection is pinned to the FTP user connection.
 * No caching support -- no reliable Request URIs for that (see above).
 * Significant FTP code restructuring on the server-facing side.
 * Initial steps towards HTTP code restructuring on the client-facing
   side.
 
 
 Changes to the general code used by the Native FTP Relay code:
 
 
 * The user- and origin-facing code restructured as agreed previously on
   this mailing list. I will email some thoughts about the results separately,
   but here is the executive summary:

 src/servers/FtpServer.*  # new FTP server, relaying FTP
 src/servers/HttpServer.* # old ConnStateData parts conflicting with 
 FtpServer
 src/clients/FtpClient.*  # code shared by old and new FTP clients
 src/clients/FtpGateway.* # old FTP client, translating back to HTTP
 src/clients/FtpRelay.*   # new FTP client, relaying FTP
 src/ftp/*# FTP stuff shared by clients and servers


 * Fixed HttpHdr::Private/NoCache(v) implementations and optimized their API.

 * Tokenizer fixes and API improvements:

   Taught Tokenizer to keep track of the number of parsed bytes. Many callers
   need to know that because they need to adjust/consume I/O offsets/buffers.
   
   Adjusted unused Parser::Tokenizer::token() to not treat NUL delimiter
   specially. Besides the fact that not all grammars can treat NUL that way, 
 the
   special NUL treatment resulted in some token() calls returning true for 
 empty
   tokens, which was confusing parsers. Callers that do not require trailing
   delimiters, should use prefix() instead. This change is based on experience
   writing Tokenizer-based FTP parser, although the final parser code uses
   prefix() instead of token(), for unrelated reasons.
   
   Split Parser::Tokenizer::skip(set) into skipOne() and skipAll(). All other
   skip() methods skip exactly one thing (either a given character or a given
   token) but the old skip(set) method was skipping multiple things. That
   confused callers. We now force the caller to make 

Re: [PATCH] Native FTP Relay

2014-08-08 Thread Alex Rousskov
On 08/08/2014 09:48 AM, Amos Jeffries wrote:
 On 8/08/2014 3:29 p.m., Alex Rousskov wrote:
 Hello,

 Native FTP Relay support is finally ready: Squid receives native FTP
 commands on ftp_port and relays FTP messages between FTP clients and FTP
 origin servers through the existing Squid access control, adaptation,
 and logging layers. A compressed 70KB patch attached to this email is
 also temporary available at [1]. I would like to merge the corresponding
 lp branch[2] into Squid trunk.

 This is a large, complex, experimental feature. I am sure there are
 cases where it does not work well or does not support some existing
 features. I am not aware of any regressions specific to old FTP gateway
 code, and I hope there are no regressions specific to HTTP code, but I
 do not recommend deployment without testing.

 The branch code worked quite well in limited production deployment, but
 the final version (with code restructuring and polishing for the
 official submission) has only seen simple lab testing. AFAIK, the FTP
 interception path has not seen any production deployment at all.

 This code represents more than a year worth of development, started by
 Dmitry Kurochkin in the beginning of 2013. Christos Tsantilas and I
 finished Dmitry's work. I bear responsibility for the bugs, but there
 are probably areas of Dmitry's code that appear to work well and that
 neither Christos nor I have studied.

 Overall, we tried to focus on the new FTP code and leave the old
 FTP/HTTP code alone, except where restructuring was necessary to avoid
 code duplication. For example, the server-facing side reuses a lot of
 the old FTP code, while the relayed FTP commands are passed around in
 HttpMsg structures using the old ClientStreams, doCallout(), and Store
 interfaces.

 If you review the patch, please note that bzr is incapable of tracking
 code across file splits (e.g., old ftp.cc becoming ftp/Parsing.cc plus
 three clients/Ftp*.cc files). Many of the old XXXs and TODOs will appear
 as newly added code in the patch. Usually, one can figure it out by
 searching for the similar but deleted code in the patch.

 Please see the above referenced lp branch for a detailed change log.


 Some Native FTP Relay highlights:

 * Added ftp_port directive telling Squid to relay native FTP commands.
 * Active and passive FTP support on the user-facing side; require
   passive connections to come from the control connection src IP.
 * IPv6 support (EPSV and, on the user-facing side, EPRT).
 * Intelligent adaptation of relayed FTP FEAT responses.
 * Relaying of multi-line FTP control responses using various formats.
 * Support relaying of FTP MLSD and MLST commands (RFC 3659).
 * Several Microsoft FTP server compatibility features.
 * ICAP/eCAP support (at individual FTP command/response level).
 * Optional current FTP directory tracking (cannot be 100% reliable due
   to symbolic links and such, but is helpful in some common use cases).
 * FTP origin control connection is pinned to the FTP user connection.
 * No caching support -- no reliable Request URIs for that (see above).
 * Significant FTP code restructuring on the server-facing side.
 * Initial steps towards HTTP code restructuring on the client-facing
   side.


 Changes to the general code used by the Native FTP Relay code:


 * The user- and origin-facing code restructured as agreed previously on
   this mailing list. I will email some thoughts about the results 
 separately,
   but here is the executive summary:

 src/servers/FtpServer.*  # new FTP server, relaying FTP
 src/servers/HttpServer.* # old ConnStateData parts conflicting with 
 FtpServer
 src/clients/FtpClient.*  # code shared by old and new FTP clients
 src/clients/FtpGateway.* # old FTP client, translating back to HTTP
 src/clients/FtpRelay.*   # new FTP client, relaying FTP
 src/ftp/*# FTP stuff shared by clients and servers


 * Fixed HttpHdr::Private/NoCache(v) implementations and optimized their API.

 * Tokenizer fixes and API improvements:

   Taught Tokenizer to keep track of the number of parsed bytes. Many callers
   need to know that because they need to adjust/consume I/O offsets/buffers.
   
   Adjusted unused Parser::Tokenizer::token() to not treat NUL delimiter
   specially. Besides the fact that not all grammars can treat NUL that way, 
 the
   special NUL treatment resulted in some token() calls returning true for 
 empty
   tokens, which was confusing parsers. Callers that do not require trailing
   delimiters, should use prefix() instead. This change is based on 
 experience
   writing Tokenizer-based FTP parser, although the final parser code uses
   prefix() instead of token(), for unrelated reasons.
   
   Split Parser::Tokenizer::skip(set) into skipOne() and skipAll(). All other
   skip() methods skip exactly one thing (either a given character or a given
   token) but the old skip(set) method was skipping multiple things. That
   confused 

Re: [PATCH] Native FTP Relay

2014-08-08 Thread Amos Jeffries
On 9/08/2014 4:57 a.m., Alex Rousskov wrote:
 On 08/08/2014 09:48 AM, Amos Jeffries wrote:
snip

 Audit results (part 1):

 * Please apply CharacterSet updates separately.

 * Please apply Tokenizer API updates separately.

 * please apply src/HttpHdrCc.h changes separately.
 
 Why? If the branch is merged, they will be applied with their own/
 existing commit messages.
 

They are updates on previous feature changesets unrelated to FTP which
will block future parser alterations (also unrelated to FTP) being
backported if this project takes long to stabilize.

If they have their own commit messages then cherrypicking out of the
branch should be relatively easy.

 
 * is FTP/1.1 really the protocol version of FTP parsed by Squid? (the
 AnyP::ProtocolVersion parameters)
 ...
  - NP: 0,0 version should be usable and probably best since there does
 not appear to be any actual official version numbering for FTP.
 
 Since there is no any actual official version numbering for FTP, we
 are using what works best. Version 1.1 works best because it causes
 fewer FTP exceptions in the general code because FTP control connections
 are persistent by default, just like HTTP/1.1 connections. I think there
 is a comment about that.
 
 Using 0.0 would probably create more exceptions. 0.0 will most likely
 also screw up some ICAP services that expect /1.0 or /1.1 when parsing
 headers.
 
 After reading the above arguments, do you still want us to switch to 0.0?
 

Yuck. Messy.

Okay, leave as-is but please document that as the reason for using that
version number and a TODO about cleaning up the message processing to
stop assuming HTTP versions.

 
 in src/cf.data.pre:
 * please document tproxy, name=, protocol=FTP as supported on ftp_port
 
 We do not know whether it is supported and do not plan on testing/fixing
 that support right now. Do you still want us to document it as supported?
 

Yes. You have not changed the TcpAcceptor code for TPROXY lookups or the
ACL code using port name. So they are supported. protocol= you
explicitly added support in the parser.

 
 in src/client_side.h:
 * what is waiting for future HttpsServer home meant to mean on
 postHttpsAccept()
 
 Unlike HTTP and FTP, there is currently no object dedicated to serving
 HTTPS connections. HttpServer is misused for that on one occasion, but a
 lot more work is needed to properly move HTTPS code from ConnStateData
 into severs/HttpsServer.*  That work is unrelated to FTP.

Okay. Thanks.

Amos


[PATCH] Native FTP Relay

2014-08-07 Thread Alex Rousskov
Hello,

Native FTP Relay support is finally ready: Squid receives native FTP
commands on ftp_port and relays FTP messages between FTP clients and FTP
origin servers through the existing Squid access control, adaptation,
and logging layers. A compressed 70KB patch attached to this email is
also temporary available at [1]. I would like to merge the corresponding
lp branch[2] into Squid trunk.

This is a large, complex, experimental feature. I am sure there are
cases where it does not work well or does not support some existing
features. I am not aware of any regressions specific to old FTP gateway
code, and I hope there are no regressions specific to HTTP code, but I
do not recommend deployment without testing.

The branch code worked quite well in limited production deployment, but
the final version (with code restructuring and polishing for the
official submission) has only seen simple lab testing. AFAIK, the FTP
interception path has not seen any production deployment at all.

This code represents more than a year worth of development, started by
Dmitry Kurochkin in the beginning of 2013. Christos Tsantilas and I
finished Dmitry's work. I bear responsibility for the bugs, but there
are probably areas of Dmitry's code that appear to work well and that
neither Christos nor I have studied.

Overall, we tried to focus on the new FTP code and leave the old
FTP/HTTP code alone, except where restructuring was necessary to avoid
code duplication. For example, the server-facing side reuses a lot of
the old FTP code, while the relayed FTP commands are passed around in
HttpMsg structures using the old ClientStreams, doCallout(), and Store
interfaces.

If you review the patch, please note that bzr is incapable of tracking
code across file splits (e.g., old ftp.cc becoming ftp/Parsing.cc plus
three clients/Ftp*.cc files). Many of the old XXXs and TODOs will appear
as newly added code in the patch. Usually, one can figure it out by
searching for the similar but deleted code in the patch.

Please see the above referenced lp branch for a detailed change log.


Some Native FTP Relay highlights:

* Added ftp_port directive telling Squid to relay native FTP commands.
* Active and passive FTP support on the user-facing side; require
  passive connections to come from the control connection src IP.
* IPv6 support (EPSV and, on the user-facing side, EPRT).
* Intelligent adaptation of relayed FTP FEAT responses.
* Relaying of multi-line FTP control responses using various formats.
* Support relaying of FTP MLSD and MLST commands (RFC 3659).
* Several Microsoft FTP server compatibility features.
* ICAP/eCAP support (at individual FTP command/response level).
* Optional current FTP directory tracking (cannot be 100% reliable due
  to symbolic links and such, but is helpful in some common use cases).
* FTP origin control connection is pinned to the FTP user connection.
* No caching support -- no reliable Request URIs for that (see above).
* Significant FTP code restructuring on the server-facing side.
* Initial steps towards HTTP code restructuring on the client-facing
  side.


Changes to the general code used by the Native FTP Relay code:


 * The user- and origin-facing code restructured as agreed previously on
   this mailing list. I will email some thoughts about the results separately,
   but here is the executive summary:
 
 src/servers/FtpServer.*  # new FTP server, relaying FTP
 src/servers/HttpServer.* # old ConnStateData parts conflicting with 
 FtpServer
 src/clients/FtpClient.*  # code shared by old and new FTP clients
 src/clients/FtpGateway.* # old FTP client, translating back to HTTP
 src/clients/FtpRelay.*   # new FTP client, relaying FTP
 src/ftp/*# FTP stuff shared by clients and servers
 
 
 * Fixed HttpHdr::Private/NoCache(v) implementations and optimized their API.
 
 * Tokenizer fixes and API improvements:
 
   Taught Tokenizer to keep track of the number of parsed bytes. Many callers
   need to know that because they need to adjust/consume I/O offsets/buffers.
   
   Adjusted unused Parser::Tokenizer::token() to not treat NUL delimiter
   specially. Besides the fact that not all grammars can treat NUL that way, 
 the
   special NUL treatment resulted in some token() calls returning true for 
 empty
   tokens, which was confusing parsers. Callers that do not require trailing
   delimiters, should use prefix() instead. This change is based on experience
   writing Tokenizer-based FTP parser, although the final parser code uses
   prefix() instead of token(), for unrelated reasons.
   
   Split Parser::Tokenizer::skip(set) into skipOne() and skipAll(). All other
   skip() methods skip exactly one thing (either a given character or a given
   token) but the old skip(set) method was skipping multiple things. That
   confused callers. We now force the caller to make a choice. 
   
   Fixed Parser::Tokenizer::skip(char) to avoid out of bound access.
 
 
 * CharacterSet