Re: http anchor fix [Was: Re: (2) HTTP basic authentication for ftp(1) (RFC 2617)]

2012-08-12 Thread Eric P. Mangold
On Sat, Aug 11, 2012 at 09:47:23PM +0200, Christiano F. Haesbaert wrote:
 On Tue, Aug 07, 2012 at 11:50:26AM -0400, Eric P. Mangold wrote:
   [...]
 
 I would prefer this to be done on the path processing block, if
 possible. Just make sure you test the scheme for http/https, sorry for
 slacking on the revision. 
 
 Could you give it a try ?

Sure I'll try to have a look at it soon  - thanks for taking the time to look 
at this :)

Regards,
Eric P. Mangold



Re: http anchor fix [Was: Re: (2) HTTP basic authentication for ftp(1) (RFC 2617)]

2012-08-11 Thread Christiano F. Haesbaert
On Tue, Aug 07, 2012 at 11:50:26AM -0400, Eric P. Mangold wrote:
 Sorry for the top-post
 
 I was wondering if some of you (active) 'ftp' maintainers could evaluate my 
 patch below.
 
 I posted this to the list ages ago, and tried following up with some former 
 committers on the ftp code, but for whatever reason I never received any 
 communication at all.
 
 It is a very simple patch, if someone could review it, and let me know if 
 this is OK or not.
 
 I would apprciate it.
 
 original message follows...
 
 Hi,
 
 I was trying to use the 'ftp' program to retrieve the following URL:
 
 https://pypi.python.org/packages/source/p/pip/pip-1.0.2.tar.gz#md5=47ec6ff3f6d962696fe08d4c8264ad49
 
 Which fails because it considers the fragment as part of the path, and
 so the server returns 404. These type of links are quite common these
 days and so it would be nice if 'ftp' would handle them.
 
 This is my first patch to OpenBSD, so please let me know if there is
 anyhthing else I can do to get this feature implemented. The patch is
 very simple, and I think, correct, as '#' is an unsafe character, and
 must always be URL-encoded if it is to appear literally in a URL and
 not be interpreted as the start of a fragment identifier.
 

I would prefer this to be done on the path processing block, if
possible. Just make sure you test the scheme for http/https, sorry for
slacking on the revision. 

Could you give it a try ?

if (isfileurl) {
path = host;
} else {
path = strchr(host, '/');   /* Find path */
if (EMPTYSTRING(path)) {
if (outfile) {  /* No slash, but */
path=strchr(host,'\0'); /* we have
outfile. */
goto noslash;
}
if (isftpurl)
goto noftpautologin;
warnx(No `/' after host (use -o): %s, origline);
goto cleanup_url_get;
}
*path++ = '\0';

 probably here

if (EMPTYSTRING(path)  !outfile) {
if (isftpurl)
goto noftpautologin;
warnx(No filename after host (use -o): %s, origline);
goto cleanup_url_get;
}

}







 Regards,
 Eric P. Mangold
 
 Index: fetch.c
 ===
 RCS file: /cvs/src/usr.bin/ftp/fetch.c,v
 retrieving revision 1.103
 diff -u -r1.103 fetch.c
 --- fetch.c 25 Aug 2010 20:32:37 -  1.103
 +++ fetch.c 29 Oct 2011 03:34:02 -
 @@ -205,6 +205,11 @@
 if (newline == NULL)
 errx(1, Can't allocate memory to parse URL);
 if (strncasecmp(newline, HTTP_URL, sizeof(HTTP_URL) - 1) == 0) {
 +/* Remove any trailing fragment identifier from the HTTP URL.
 + * Fragments (HTTP anchors) are identified by a hash char 
 ('#'),
 + * as per RFC 3986. */
 +newline = strsep(newline, #);
 +
 host = newline + sizeof(HTTP_URL) - 1;
  #ifndef SMALL
 scheme = HTTP_URL;
 @@ -221,6 +226,11 @@
  #ifndef SMALL
 scheme = FILE_URL;
 } else if (strncasecmp(newline, HTTPS_URL, sizeof(HTTPS_URL) - 1) == 
 0) {
 +/* Remove any trailing fragment identifier from the HTTPS 
 URL.
 + * Fragments (HTTP anchors) are identified by a hash char 
 ('#'),
 + * as per RFC 3986. */
 +newline = strsep(newline, #);
 +
 host = newline + sizeof(HTTPS_URL) - 1;
 ishttpsurl = 1;
 scheme = HTTPS_URL;
 
 
 On Mon, Aug 06, 2012 at 10:56:28PM +0200, Christiano F. Haesbaert wrote:
  Please ignore the other thread, it takes ages for me to open my sent box
  over gprs, so I'm opening a new one.
  
  
  Index: fetch.c
  ===
  RCS file: /cvs/src/usr.bin/ftp/fetch.c,v
  retrieving revision 1.105
  diff -d -u -p -r1.105 fetch.c
  --- fetch.c 30 Apr 2012 13:41:26 -  1.105
  +++ fetch.c 6 Aug 2012 20:49:51 -
  @@ -177,7 +177,7 @@ url_get(const char *origline, const char
   {
  char pbuf[NI_MAXSERV], hbuf[NI_MAXHOST], *cp, *portnum, *path, ststr[4];
  char *hosttail, *cause = unknown, *newline, *host, *port, *buf = NULL;
  -   char *epath, *redirurl, *loctail;
  +   char *epath, *redirurl, *loctail, *h, *p;
  int error, i, isftpurl = 0, isfileurl = 0, isredirect = 0, rval = -1;
  struct addrinfo hints, *res0, *res, *ares = NULL;
  const char * volatile savefile;
  @@ -191,7 +191,7 @@ url_get(const char *origline, const char
  size_t len, wlen;
   #ifndef SMALL
  char *sslpath = NULL, *sslhost = NULL;
  -   char *locbase,