Re: [PATCH] Native FTP Relay

2014-08-11 Thread Amos Jeffries
On 11/08/2014 11:30 a.m., Alex Rousskov wrote:
 On 08/10/2014 06:11 AM, Amos Jeffries wrote:
snip
 * please use tok.prefix(CharacterSet::ALPHA) to parse the FTP command
 instead of BlackSpace. Then explicit delimiter check to validate the input.
  - RFC 959 is clear:
   The command codes themselves are alphabetic characters terminated
by the character SP (Space) if parameters follow and Telnet-EOL
otherwise.
 
 Not changed. I do not see a compelling reason for a general purpose
 relay to police traffic by default. Squid itself does not care if there
 is some other character present in some command name. Why increase the
 chances of breaking what was working without Squid by rejecting commands
 by default?
 
 Yes, it is possible that some command that Squid does care about would
 have invalid trailing characters in it, preventing Squid from
 recognizing it, but, with your approach, Squid would have to reject that
 command anyway, so we would not break something that would work in a
 policing Squid. In the worst case, we would just make triage more difficult.
 
 If you insist on Squid rejecting RFC-invalid command names, I will
 change the code, but I suggest leaving it permissive for now.
 

I disagree be should be quite so tolerant in the current Internet
without an explicit reason. But that is not a blocker on this patch, we
can fix it later.

RFC 959 is quite explicit. Section 5.3 documents what consists a valid
command (alphabet characters only, case insensitive, 4 or fewer).
Also, the basic FTP commands are a fairly well-known set. Everything
else must be listed in FEAT - which we can filter for offerings of
commands not fitting the supported syntax.



+1. Branch looks good enough for merge now. Please do, so I can re-work
the other concurrent submissions on the new
Tokenizer/CharacterSet/ConnStateData APIs.

Amos



Re: [PATCH] Native FTP Relay

2014-08-11 Thread Alex Rousskov
On 08/11/2014 05:50 AM, Amos Jeffries wrote:
 On 11/08/2014 11:30 a.m., Alex Rousskov wrote:
 On 08/10/2014 06:11 AM, Amos Jeffries wrote:
 snip
 * please use tok.prefix(CharacterSet::ALPHA) to parse the FTP command
 instead of BlackSpace. Then explicit delimiter check to validate the input.
  - RFC 959 is clear:
   The command codes themselves are alphabetic characters terminated
by the character SP (Space) if parameters follow and Telnet-EOL
otherwise.

 Not changed. I do not see a compelling reason for a general purpose
 relay to police traffic by default. Squid itself does not care if there
 is some other character present in some command name. Why increase the
 chances of breaking what was working without Squid by rejecting commands
 by default?

 Yes, it is possible that some command that Squid does care about would
 have invalid trailing characters in it, preventing Squid from
 recognizing it, but, with your approach, Squid would have to reject that
 command anyway, so we would not break something that would work in a
 policing Squid. In the worst case, we would just make triage more difficult.

 If you insist on Squid rejecting RFC-invalid command names, I will
 change the code, but I suggest leaving it permissive for now.

 
 I disagree be should be quite so tolerant in the current Internet
 without an explicit reason. But that is not a blocker on this patch, we
 can fix it later.
 
 RFC 959 is quite explicit. Section 5.3 documents what consists a valid
 command (alphabet characters only, case insensitive, 4 or fewer).
 Also, the basic FTP commands are a fairly well-known set. Everything
 else must be listed in FEAT - which we can filter for offerings of
 commands not fitting the supported syntax.

I view this from interoperability point of view: From that point of
view, by default, traffic on the wire is more important than an RFC
text, especially when dealing with an old protocol (with weird
implementations that cannot be changed) and an old RFC (with vague
requirements that can be easily misinterpreted or interpreted
differently). In that context, Do no harm may overrule Obey RFC.

I say this with no disrespect to IETF and without any implication that
RFCs themselves are unimportant. Needless to say, it is often developer
inability to follow an RFC that causes interoperability problems in the
first place (and may force us, proxy developers, to accommodate buggy
implementations).

Furthermore, I understand that being tolerant leads to problems on its
own. If we learn that we have to police FTP command names for some
specific good reason, I would not hesitate supporting the corresponding
changes.


 +1. Branch looks good enough for merge now.

Merged as trunk r13528. I also updated release notes in hope to minimize
your release work, but I cannot test their rendering, and you probably
still want to audit them for style (at least).

I plan to fix Tokenizer::prefix() return value for empty prefixes next.


Thank you,

Alex.



Re: [PATCH] Native FTP Relay

2014-08-10 Thread Amos Jeffries
On 10/08/2014 2:44 p.m., Alex Rousskov wrote:
 On 08/08/2014 11:13 AM, Amos Jeffries wrote:
 On 9/08/2014 4:57 a.m., Alex Rousskov wrote:
 On 08/08/2014 09:48 AM, Amos Jeffries wrote:
 Audit results (part 1):
 * Please apply CharacterSet updates separately.
 * Please apply Tokenizer API updates separately.
 * please apply src/HttpHdrCc.h changes separately.
 
 Why? If the branch is merged, they will be applied with their own/
 existing commit messages.
 
 They are updates on previous feature changesets unrelated to FTP which
 will block future parser alterations (also unrelated to FTP) being
 backported if this project takes long to stabilize.

 If they have their own commit messages then cherrypicking out of the
 branch should be relatively easy.
 
 Thank you for explaining your rationale. To minimize overheads, let's
 come back to this issue if the branch is not merged soon.
 
 
 * is FTP/1.1 really the protocol version of FTP parsed by Squid? (the
 AnyP::ProtocolVersion parameters)
 
 Okay, leave as-is but please document that as the reason for using that
 version number
 
 Done. It was actually documented in the corresponding commit message. To
 provide _source code_ documentation in one place, I added
 Ftp::ProtocolVersion() and used that everywhere instead of hard-coded
 1,1. It will be easier to track down all 1,1 users that way if
 somebody wants to change or polish this further later. See ftp/Elements.*

Thank you.
NP: I found two related spots that need converting as well. Highlighted
below in this round of audit notes.

 
 The resulting code required more changes than one may expect and is
 still a little awkward because Http::ProtocolVersion should have been
 just a function (not a class/type), and because PortCfg::setTransport
 internals did not belong to AnyP where they were placed. I did my best
 to avoid cascading changes required to fix all that by leaving
 Http::ProtocolVersion as is and moving PortCfg::setTransport() internals
 into cache_cf.cc where they can access FTP-specific code.
 
 
 and a TODO about cleaning up the message processing to
 stop assuming HTTP versions.
 
 Done in one of the many client_side.cc places where we still assume
 HTTP. Here is the new TODO:
 
 /* RFC 2616 section 10.5.6 : handle unsupported HTTP major versions 
 cleanly. */
 /* We currently only support 0.9, 1.0, 1.1 properly */
 /* TODO: move HTTP-specific processing into servers/HttpServer and such 
 */
 if ( (http_ver.major == 0  http_ver.minor != 9) ||
 (http_ver.major  1) ) {
 
 
 This part of the problem will be gone when HTTP-specific code will be
 moved to HttpServer, but the ICAP compatibility part will not go away
 regardless of any cleanup.

The ICAP compatibility issue is a bug in ICAP or specific ICAP servers
that needs fixing there. HTTP itself can already (or shortly will) have
present HTTP/2.0, h2 or h2-NN message versions to be encapsulated.


 
 in src/Server.cc:
 * this change looks semantically wrong:
-if (!doneWithServer()) {
+if (mayReadVirginReplyBody()) {
  closeServer();
 
 The change itself does not make the code look semantically wrong. On the
 semantics level, the code looks about the same before and after the change:
 
 Was: If we are not done with the server, close it.
 Now: If we are still reading from the server, close it.

It is now *apparently* skipping all the possible non-reply-reading
reasons for closing the server connection. That is semantic change...

 
 in src/Server.h:
 * I'm not sure the call order of doneWithServer() and
 mayReadVirginReplyBody() is the right way around. 
 
 Sorry, I am not sure what you mean by call order in src/Server.h
 context. That file does not determine the order of those two calls.

... but mayReadVirginReplyBody() is calling doneWithServer() in the HTTP
case. You clarified why its done below and I am certain now that it is
indeed wrong, although necessary as a temporary workaround until the
non-reply-reading cases are checked and fixed.

 
 It seems like being
 done completely with a server includes whether more reply is coming
 as one sub-condition of several (see above), not the reverse.
 
 Correct. As the commit message quoted above tries to explain, FTP needed
 to distinguish two cases that the old code merged together:
 
 * The transaction may still be receiving the virgin response (the new
 mayReadVirginReplyBody method). This is when FTP and HTTP code has to
 end communication with the server. No other solution here right now (as
 the source code comment explains).
 
 * The transaction stopped receiving the virgin response but may still be
 able to communicate with the origin server. In this case, the FTP code
 wants to keep the [pinned] _control_ connection to the origin server
 open and, hence, does not want to call closeServer().
 
 The code remains the same for HTTP because the new
 mayReadVirginReplyBody() method just calls the old doneWithServer()
 method in HTTP case.
 


Re: [PATCH] Native FTP Relay

2014-08-10 Thread Alex Rousskov
On 08/10/2014 06:11 AM, Amos Jeffries wrote:
 On 10/08/2014 2:44 p.m., Alex Rousskov wrote:
 This part of the problem will be gone when HTTP-specific code will be
 moved to HttpServer, but the ICAP compatibility part will not go away
 regardless of any cleanup.
 
 The ICAP compatibility issue is a bug in ICAP or specific ICAP servers
 that needs fixing there.

Yes, of course, but fixing those ICAP services is often beyond Squid
admin control, and there is no pressing need to break compatibility with
those ICAP services _now_ (by using a different FTP version than 1.0
or 1.1).


 HTTP itself can already (or shortly will) have
 present HTTP/2.0, h2 or h2-NN message versions to be encapsulated.

I suspect we will receive requests to optionally mask those new versions
when sent over ICAP because some broken service that the admin cannot
fix cannot handle much more than /1.0 and /1.1.


 in src/Server.cc:
 * this change looks semantically wrong:
-if (!doneWithServer()) {
+if (mayReadVirginReplyBody()) {
  closeServer();

 The change itself does not make the code look semantically wrong. On the
 semantics level, the code looks about the same before and after the change:

 Was: If we are not done with the server, close it.
 Now: If we are still reading from the server, close it.
 
 It is now *apparently* skipping all the possible non-reply-reading
 reasons for closing the server connection. That is semantic change...

If you think the above change affects something on the semantics level,
then I do not have any new arguments to convince you otherwise.
Hopefully, the diverging opinions on the code looks are not important in
this case, because we seem to agree that the patch code is correct (at
least in a sense that it does not make the known problem any worse and
actually solves one sub-problem).


 Okay, I think I understand the cases now. So the old HTTP code was
 broken by not considering the request and Trailers cases as okay even
 when ICAP closed. You are just not fixing that bug.

Yes, I am not fixing any old bugs here. As for what exactly is broken
and by how much, I prefer not to discuss those details further (here and
now) because the patch does not change them.


 Would you mind adding a TODO to the new HTTP mayReadVirginReplyBody()
 that it should be only considering reply body reads, not overall server use?

Done. There was no HTTP mayReadVirginReplyBody(), but I added one.


 in src/cache_cf.cc:
 * the new parsePortProtocol() FTP case should return
 Ftp::ProtocolVersion().

Done.


 in src/cf.data.pre:
 * mentions of ftp-track-dirs=on|off need updating

Fixed.


 in src/ftp/Elements.cc:
 * Ftp::HttpReplyWrapper() has unnecessary:
   httpVersion = Http::ProtocolVersion(Ftp::ProtocolVersion().major,
 Ftp::ProtocolVersion().minor);
 
  - the current default Http::ProtocolVersion() constructor sets the
 right values for HTTP messages generated by Squid. The old code using
 Http::ProtocolVersion(1, 1) was incorrect/outdated.
 
  - same for Ftp::Server::parseOneRequest() in src/servers/FtpServer.cc

Left as is because the Http::ProtocolVersion default is irrelevant in
those two places.

We should set the wrapping message version to what FTP code wants to use
for HTTP wrapping. That version is defined by Ftp::ProtocolVersion().
The fact that the HTTP default currently matches the version used for
FTP wrapping is a coincidence that the code should not rely on. Imagine
changing the HTTP default to 2.0 tomorrow and then spending hours trying
to find all the places where FTP was _implicitly_ relying on that
default being 1.1...


 in src/clients/Makefile.am:
 * no need for forward.h in the SOURCES list to be separated from the
 other files.
  - same in src/servers/Makefile.am

Removed.


 * please include $(top_srcdir)/src/TestHeaders.am to run the .h header
 unit tests
  - same in src/ftp/Makefile.am and src/servers/Makefile.am

Done.


 in src/servers/forward.h:
 * missing pre-define for class ConnStateData
  - this would have been caught by the .h unit tests mentioned above.

Fixed.


 in src/servers/FtpServer.cc:
 
 + Ftp::Server::parseOneRequest() for all the following until ++
 
 * please reference RFC 959 section 3.1.1.5.2 for the unusual definition
 of WhiteSpace CharacterSet.

Done. It was actually based on isspace(3) so not that unusual :-)


 * please use existing CharacterSet::LF instead of re-defining as local
 NewLine.

Done.


 * please use tok.prefix(CharacterSet::ALPHA) to parse the FTP command
 instead of BlackSpace. Then explicit delimiter check to validate the input.
  - RFC 959 is clear:
   The command codes themselves are alphabetic characters terminated
by the character SP (Space) if parameters follow and Telnet-EOL
otherwise.

Not changed. I do not see a compelling reason for a general purpose
relay to police traffic by default. Squid itself does not care if there
is some other character present in some command name. Why increase the
chances of breaking 

Re: [PATCH] Native FTP Relay

2014-08-09 Thread Alex Rousskov
On 08/08/2014 11:13 AM, Amos Jeffries wrote:
 On 9/08/2014 4:57 a.m., Alex Rousskov wrote:
 On 08/08/2014 09:48 AM, Amos Jeffries wrote:
 Audit results (part 1):
 * Please apply CharacterSet updates separately.
 * Please apply Tokenizer API updates separately.
 * please apply src/HttpHdrCc.h changes separately.

 Why? If the branch is merged, they will be applied with their own/
 existing commit messages.

 They are updates on previous feature changesets unrelated to FTP which
 will block future parser alterations (also unrelated to FTP) being
 backported if this project takes long to stabilize.
 
 If they have their own commit messages then cherrypicking out of the
 branch should be relatively easy.

Thank you for explaining your rationale. To minimize overheads, let's
come back to this issue if the branch is not merged soon.


 * is FTP/1.1 really the protocol version of FTP parsed by Squid? (the
 AnyP::ProtocolVersion parameters)

 Okay, leave as-is but please document that as the reason for using that
 version number

Done. It was actually documented in the corresponding commit message. To
provide _source code_ documentation in one place, I added
Ftp::ProtocolVersion() and used that everywhere instead of hard-coded
1,1. It will be easier to track down all 1,1 users that way if
somebody wants to change or polish this further later. See ftp/Elements.*

The resulting code required more changes than one may expect and is
still a little awkward because Http::ProtocolVersion should have been
just a function (not a class/type), and because PortCfg::setTransport
internals did not belong to AnyP where they were placed. I did my best
to avoid cascading changes required to fix all that by leaving
Http::ProtocolVersion as is and moving PortCfg::setTransport() internals
into cache_cf.cc where they can access FTP-specific code.


 and a TODO about cleaning up the message processing to
 stop assuming HTTP versions.

Done in one of the many client_side.cc places where we still assume
HTTP. Here is the new TODO:

 /* RFC 2616 section 10.5.6 : handle unsupported HTTP major versions 
 cleanly. */
 /* We currently only support 0.9, 1.0, 1.1 properly */
 /* TODO: move HTTP-specific processing into servers/HttpServer and such */
 if ( (http_ver.major == 0  http_ver.minor != 9) ||
 (http_ver.major  1) ) {


This part of the problem will be gone when HTTP-specific code will be
moved to HttpServer, but the ICAP compatibility part will not go away
regardless of any cleanup.


 in src/cf.data.pre:
 * please document tproxy, name=, protocol=FTP as supported on ftp_port

 We do not know whether it is supported and do not plan on testing/fixing
 that support right now. Do you still want us to document it as supported?

 Yes. You have not changed the TcpAcceptor code for TPROXY lookups or the
 ACL code using port name. So they are supported.

Done. Please note that protocol= requires accel mode, and I am not sure
what that mode means for a native FTP relay. I added protocol=FTP anyway
in hope to shorten the discussion.


 protocol= you explicitly added support in the parser.

This is probably a misunderstanding caused by poor official code
quality. We did not add explicit support for the protocol= option. The
parsing code we added was required for the _default_ ftp_port
configuration (no protocol=... option) to work because of the way the
general port parsing (and other) official code is written.

The official code uses CfgPort::transport for at least two rather
different purposes (for determining foo in foo_port and for the
protocol= feature). Cleaning this up requires a serious effort well
beyond the FTP project scope (the FTP code does not really need the
protocol= feature to work).

If, after reading the above clarification, you would rather see
protocol=FTP gone from squid.conf.documented, I would be more than happy
to remove it.


 in src/FwdState.cc:
 * at line 817 calling request-hier.note() twice in a row with different
 values is both needless change and wasting CPU.

Fixed? This was my merge bug. The fixed patch uses the more recent line
version (and both lines were yours, thank you).


 in src/HttpHeader.cc:
 * please enact the new TODO about stripping FTP-Arguments header on
 PASS command. 

Done to shorten the discussion.


 This information leak could earn us a CVE if merged as-is.

I do not think it could because that code is currently unused by/for FTP
relays (as the added source code comment tried to explain). Native FTP
relay does not send Squid error pages to the user and those error
pages are the only context (that I know of) where the masking code is
actually used today.


 in src/HttpHeaderTools.cc:
 * httpHeaderQuoteString() could be optimized with needInnerQuote as
 SBuf::size_type to append in blocks between characters needing quote.

Agreed. Added a corresponding TODO. I do not want to volunteer to
provide the corresponding optimization at this time and consider it
rather 

Re: [PATCH] Native FTP Relay

2014-08-08 Thread Amos Jeffries
On 8/08/2014 3:29 p.m., Alex Rousskov wrote:
 Hello,
 
 Native FTP Relay support is finally ready: Squid receives native FTP
 commands on ftp_port and relays FTP messages between FTP clients and FTP
 origin servers through the existing Squid access control, adaptation,
 and logging layers. A compressed 70KB patch attached to this email is
 also temporary available at [1]. I would like to merge the corresponding
 lp branch[2] into Squid trunk.
 
 This is a large, complex, experimental feature. I am sure there are
 cases where it does not work well or does not support some existing
 features. I am not aware of any regressions specific to old FTP gateway
 code, and I hope there are no regressions specific to HTTP code, but I
 do not recommend deployment without testing.
 
 The branch code worked quite well in limited production deployment, but
 the final version (with code restructuring and polishing for the
 official submission) has only seen simple lab testing. AFAIK, the FTP
 interception path has not seen any production deployment at all.
 
 This code represents more than a year worth of development, started by
 Dmitry Kurochkin in the beginning of 2013. Christos Tsantilas and I
 finished Dmitry's work. I bear responsibility for the bugs, but there
 are probably areas of Dmitry's code that appear to work well and that
 neither Christos nor I have studied.
 
 Overall, we tried to focus on the new FTP code and leave the old
 FTP/HTTP code alone, except where restructuring was necessary to avoid
 code duplication. For example, the server-facing side reuses a lot of
 the old FTP code, while the relayed FTP commands are passed around in
 HttpMsg structures using the old ClientStreams, doCallout(), and Store
 interfaces.
 
 If you review the patch, please note that bzr is incapable of tracking
 code across file splits (e.g., old ftp.cc becoming ftp/Parsing.cc plus
 three clients/Ftp*.cc files). Many of the old XXXs and TODOs will appear
 as newly added code in the patch. Usually, one can figure it out by
 searching for the similar but deleted code in the patch.
 
 Please see the above referenced lp branch for a detailed change log.
 
 
 Some Native FTP Relay highlights:
 
 * Added ftp_port directive telling Squid to relay native FTP commands.
 * Active and passive FTP support on the user-facing side; require
   passive connections to come from the control connection src IP.
 * IPv6 support (EPSV and, on the user-facing side, EPRT).
 * Intelligent adaptation of relayed FTP FEAT responses.
 * Relaying of multi-line FTP control responses using various formats.
 * Support relaying of FTP MLSD and MLST commands (RFC 3659).
 * Several Microsoft FTP server compatibility features.
 * ICAP/eCAP support (at individual FTP command/response level).
 * Optional current FTP directory tracking (cannot be 100% reliable due
   to symbolic links and such, but is helpful in some common use cases).
 * FTP origin control connection is pinned to the FTP user connection.
 * No caching support -- no reliable Request URIs for that (see above).
 * Significant FTP code restructuring on the server-facing side.
 * Initial steps towards HTTP code restructuring on the client-facing
   side.
 
 
 Changes to the general code used by the Native FTP Relay code:
 
 
 * The user- and origin-facing code restructured as agreed previously on
   this mailing list. I will email some thoughts about the results separately,
   but here is the executive summary:

 src/servers/FtpServer.*  # new FTP server, relaying FTP
 src/servers/HttpServer.* # old ConnStateData parts conflicting with 
 FtpServer
 src/clients/FtpClient.*  # code shared by old and new FTP clients
 src/clients/FtpGateway.* # old FTP client, translating back to HTTP
 src/clients/FtpRelay.*   # new FTP client, relaying FTP
 src/ftp/*# FTP stuff shared by clients and servers


 * Fixed HttpHdr::Private/NoCache(v) implementations and optimized their API.

 * Tokenizer fixes and API improvements:

   Taught Tokenizer to keep track of the number of parsed bytes. Many callers
   need to know that because they need to adjust/consume I/O offsets/buffers.
   
   Adjusted unused Parser::Tokenizer::token() to not treat NUL delimiter
   specially. Besides the fact that not all grammars can treat NUL that way, 
 the
   special NUL treatment resulted in some token() calls returning true for 
 empty
   tokens, which was confusing parsers. Callers that do not require trailing
   delimiters, should use prefix() instead. This change is based on experience
   writing Tokenizer-based FTP parser, although the final parser code uses
   prefix() instead of token(), for unrelated reasons.
   
   Split Parser::Tokenizer::skip(set) into skipOne() and skipAll(). All other
   skip() methods skip exactly one thing (either a given character or a given
   token) but the old skip(set) method was skipping multiple things. That
   confused callers. We now force the caller to make 

Re: [PATCH] Native FTP Relay

2014-08-08 Thread Alex Rousskov
On 08/08/2014 09:48 AM, Amos Jeffries wrote:
 On 8/08/2014 3:29 p.m., Alex Rousskov wrote:
 Hello,

 Native FTP Relay support is finally ready: Squid receives native FTP
 commands on ftp_port and relays FTP messages between FTP clients and FTP
 origin servers through the existing Squid access control, adaptation,
 and logging layers. A compressed 70KB patch attached to this email is
 also temporary available at [1]. I would like to merge the corresponding
 lp branch[2] into Squid trunk.

 This is a large, complex, experimental feature. I am sure there are
 cases where it does not work well or does not support some existing
 features. I am not aware of any regressions specific to old FTP gateway
 code, and I hope there are no regressions specific to HTTP code, but I
 do not recommend deployment without testing.

 The branch code worked quite well in limited production deployment, but
 the final version (with code restructuring and polishing for the
 official submission) has only seen simple lab testing. AFAIK, the FTP
 interception path has not seen any production deployment at all.

 This code represents more than a year worth of development, started by
 Dmitry Kurochkin in the beginning of 2013. Christos Tsantilas and I
 finished Dmitry's work. I bear responsibility for the bugs, but there
 are probably areas of Dmitry's code that appear to work well and that
 neither Christos nor I have studied.

 Overall, we tried to focus on the new FTP code and leave the old
 FTP/HTTP code alone, except where restructuring was necessary to avoid
 code duplication. For example, the server-facing side reuses a lot of
 the old FTP code, while the relayed FTP commands are passed around in
 HttpMsg structures using the old ClientStreams, doCallout(), and Store
 interfaces.

 If you review the patch, please note that bzr is incapable of tracking
 code across file splits (e.g., old ftp.cc becoming ftp/Parsing.cc plus
 three clients/Ftp*.cc files). Many of the old XXXs and TODOs will appear
 as newly added code in the patch. Usually, one can figure it out by
 searching for the similar but deleted code in the patch.

 Please see the above referenced lp branch for a detailed change log.


 Some Native FTP Relay highlights:

 * Added ftp_port directive telling Squid to relay native FTP commands.
 * Active and passive FTP support on the user-facing side; require
   passive connections to come from the control connection src IP.
 * IPv6 support (EPSV and, on the user-facing side, EPRT).
 * Intelligent adaptation of relayed FTP FEAT responses.
 * Relaying of multi-line FTP control responses using various formats.
 * Support relaying of FTP MLSD and MLST commands (RFC 3659).
 * Several Microsoft FTP server compatibility features.
 * ICAP/eCAP support (at individual FTP command/response level).
 * Optional current FTP directory tracking (cannot be 100% reliable due
   to symbolic links and such, but is helpful in some common use cases).
 * FTP origin control connection is pinned to the FTP user connection.
 * No caching support -- no reliable Request URIs for that (see above).
 * Significant FTP code restructuring on the server-facing side.
 * Initial steps towards HTTP code restructuring on the client-facing
   side.


 Changes to the general code used by the Native FTP Relay code:


 * The user- and origin-facing code restructured as agreed previously on
   this mailing list. I will email some thoughts about the results 
 separately,
   but here is the executive summary:

 src/servers/FtpServer.*  # new FTP server, relaying FTP
 src/servers/HttpServer.* # old ConnStateData parts conflicting with 
 FtpServer
 src/clients/FtpClient.*  # code shared by old and new FTP clients
 src/clients/FtpGateway.* # old FTP client, translating back to HTTP
 src/clients/FtpRelay.*   # new FTP client, relaying FTP
 src/ftp/*# FTP stuff shared by clients and servers


 * Fixed HttpHdr::Private/NoCache(v) implementations and optimized their API.

 * Tokenizer fixes and API improvements:

   Taught Tokenizer to keep track of the number of parsed bytes. Many callers
   need to know that because they need to adjust/consume I/O offsets/buffers.
   
   Adjusted unused Parser::Tokenizer::token() to not treat NUL delimiter
   specially. Besides the fact that not all grammars can treat NUL that way, 
 the
   special NUL treatment resulted in some token() calls returning true for 
 empty
   tokens, which was confusing parsers. Callers that do not require trailing
   delimiters, should use prefix() instead. This change is based on 
 experience
   writing Tokenizer-based FTP parser, although the final parser code uses
   prefix() instead of token(), for unrelated reasons.
   
   Split Parser::Tokenizer::skip(set) into skipOne() and skipAll(). All other
   skip() methods skip exactly one thing (either a given character or a given
   token) but the old skip(set) method was skipping multiple things. That
   confused 

Re: [PATCH] Native FTP Relay

2014-08-08 Thread Amos Jeffries
On 9/08/2014 4:57 a.m., Alex Rousskov wrote:
 On 08/08/2014 09:48 AM, Amos Jeffries wrote:
snip

 Audit results (part 1):

 * Please apply CharacterSet updates separately.

 * Please apply Tokenizer API updates separately.

 * please apply src/HttpHdrCc.h changes separately.
 
 Why? If the branch is merged, they will be applied with their own/
 existing commit messages.
 

They are updates on previous feature changesets unrelated to FTP which
will block future parser alterations (also unrelated to FTP) being
backported if this project takes long to stabilize.

If they have their own commit messages then cherrypicking out of the
branch should be relatively easy.

 
 * is FTP/1.1 really the protocol version of FTP parsed by Squid? (the
 AnyP::ProtocolVersion parameters)
 ...
  - NP: 0,0 version should be usable and probably best since there does
 not appear to be any actual official version numbering for FTP.
 
 Since there is no any actual official version numbering for FTP, we
 are using what works best. Version 1.1 works best because it causes
 fewer FTP exceptions in the general code because FTP control connections
 are persistent by default, just like HTTP/1.1 connections. I think there
 is a comment about that.
 
 Using 0.0 would probably create more exceptions. 0.0 will most likely
 also screw up some ICAP services that expect /1.0 or /1.1 when parsing
 headers.
 
 After reading the above arguments, do you still want us to switch to 0.0?
 

Yuck. Messy.

Okay, leave as-is but please document that as the reason for using that
version number and a TODO about cleaning up the message processing to
stop assuming HTTP versions.

 
 in src/cf.data.pre:
 * please document tproxy, name=, protocol=FTP as supported on ftp_port
 
 We do not know whether it is supported and do not plan on testing/fixing
 that support right now. Do you still want us to document it as supported?
 

Yes. You have not changed the TcpAcceptor code for TPROXY lookups or the
ACL code using port name. So they are supported. protocol= you
explicitly added support in the parser.

 
 in src/client_side.h:
 * what is waiting for future HttpsServer home meant to mean on
 postHttpsAccept()
 
 Unlike HTTP and FTP, there is currently no object dedicated to serving
 HTTPS connections. HttpServer is misused for that on one occasion, but a
 lot more work is needed to properly move HTTPS code from ConnStateData
 into severs/HttpsServer.*  That work is unrelated to FTP.

Okay. Thanks.

Amos