I shall do a brief content review because you did take the time to make a patch.
I have explained *at length* why I disagree with your approach. I shall briefly outline those arguemnts once more. I have already stated I do not wish to discuss this futher as you simply will not accept my arguments, this review/summary is provided as a pure courtesy. We have made extensive efforts to streamline the data flow through the browser in the recent past, starting with jmb implementing the low level cache, us making the fetcher interface less troublesome and culminating in the ongoing content handler refactor. I say this as you seem to think this ongoing work is directed at you personally soemhow. Curl ---- We have come to understand that the curl library simply should not be relied upon beyond its http/https implementations. We have implemented fetchers directly for things like file: explicitly. Because the curl libraries in actual common use have numerous bugs and unwanted behaviours (an example would be curl applies proxy settings to file: urls) and the huge memory footprint of curl *in our use case* it looks increasingly like a poor fit. Because of this we have gone to great efforts to *remove* fetcher support from the curl library. As discussed at the develoepr meetings previously Daniel Silverstone is working on creating an http library to eventually completely remove our dependancy on curl. Content ------- As mentioned to you, The way we have chosen for things that generate output (primarily directory listings but the about: handler also) that the browser should display as HTML is that the *fetcher* layer returns the data as generated HTML. We have created helper functions for this generation for directory listings. If the formatted output was not to your liking these content generation routines should have been extended and the formatitng applied using the internal.css. The method of inventing a content type and content handler which then rummages around inside a different content handler is full of problems. Indeed we have worked to completely remove this method of operation and built a data model where it is not supported. Review ------ Will be inline with code thats ok elided, as usual. > > Added files > > > Index: render/gopher.c > =================================================================== > --- /dev/null 2011-05-08 22:46:04.000000000 +0200 > +++ render/gopher.c 2011-05-08 22:41:11.000000000 +0200 > @@ -0,0 +1,753 @@ > +/* > + * Copyright 2006 James Bursa <bu...@users.sourceforge.net> > + * Copyright 2006 Adrian Lees <adri...@users.sourceforge.net> > + * Copyright 2011 François Revol <mmu_...@users.sourceforge.net> odd you credit james and adrian in a new file? > + > +/* HACK HACK */ > +extern const content_handler html_content_handler; right there is why you have got no other reviews by the way, not just because it *is* a hack but because you knew it and did nothing about it, or at least provided a rational explanation. > + > +nserror gopher_create(const content_handler *handler, > + lwc_string *imime_type, const http_parameter *params, > + llcache_handle *llcache, const char *fallback_charset, > + bool quirks, struct content **c); > +bool gopher_convert(struct content *c); > +content_type gopher_content_type(lwc_string *mime_type); > + > +static char *gen_nice_title(const char *path); > +static bool gopher_generate_top(char *buffer, int buffer_length); > +static bool gopher_generate_title(const char *title, char *buffer, int > buffer_length); > +static bool gopher_generate_row(const char **data, size_t *size, > + char *buffer, int buffer_length); > +static bool gopher_generate_bottom(char *buffer, int buffer_length); > + > +static const char gopher_faked_type[] = "text/x-gopher-directory"; > +static lwc_string *gopher_faked_mime_type; as a comment we are trying to move away from forward declaration, secondly every function should be doccommented somewhere. if its static, in its scope here. If its external a comment saying in which header it is docuemnted. > + > +static struct { > + char type; > + const char *mime; > +} gopher_type_map[] = { > + /* these come from http://tools.ietf.org/html/rfc1436 */ > + { '0', "text/plain" }, > + { '1', "text/x-gopher-directory;charset=UTF-8" }, /* gopher > directory */ > + /* 2 CSO search */ > + /* 3 error message */ > + /* 4 binhex encoded text */ > + /* 5 binary archive file */ > + /* 6 uuencoded text */ > + { '7', "text/x-gopher-directory;charset=UTF-8" }, /* search query > */ > + /* 8 telnet: */ > + /* 9 binary */ > + { 'g', "image/gif" }, > + { 'h', "text/html" }, > + /* i information text */ > + /* I image (depends, usually jpeg) */ > + /* s audio (wav?) */ > + /* T tn3270 session */ > + > + /* those are not standardized */ > + { 'd', "application/pdf" }, /* display?? seems to be only for PDF > files so far */ > + { 'p', "image/png"}, /* at least on > gopher://namcub.accelera-labs.com/1/pics */ > + { 0, NULL } > +}; > + arghhhh magic letters! no! an enum if you please > +static const content_handler gopher_content_handler = { > + gopher_create, > + NULL, > + gopher_convert, > + NULL, > + NULL, > + NULL, > + NULL, > + NULL, > + NULL, > + NULL, > + NULL, > + NULL, > + NULL, > + gopher_content_type, > + true > +}; > + yes i know the changes were only applied recently, but named initialisors pls > + > +/** > + * Create a CONTENT_GOPHER. there are six parameters here... > + */ > + > +nserror gopher_create(const content_handler *handler, > + lwc_string *imime_type, const http_parameter *params, > + llcache_handle *llcache, const char *fallback_charset, > + bool quirks, struct content **c) > + > + > +static char *html_escape_string(char *str) > +{ > + char *nice_str, *cnv, *tmp; > + > + if (str == NULL) { > + return NULL; > + } > + > + /* Convert str for display */ > + nice_str = malloc(strlen(str) * SLEN("&") + 1); > + if (nice_str == NULL) { > + return NULL; > + } > + > + /* Escape special HTML characters */ > + for (cnv = nice_str, tmp = str; *tmp != '\0'; tmp++) { > + if (*tmp == '<') { > + *cnv++ = '&'; > + *cnv++ = 'l'; > + *cnv++ = 't'; > + *cnv++ = ';'; > + } else if (*tmp == '>') { > + *cnv++ = '&'; > + *cnv++ = 'g'; > + *cnv++ = 't'; > + *cnv++ = ';'; > + } else if (*tmp == '&') { > + *cnv++ = '&'; > + *cnv++ = 'a'; > + *cnv++ = 'm'; > + *cnv++ = 'p'; > + *cnv++ = ';'; > + } else { > + *cnv++ = *tmp; > + } > + } > + *cnv = '\0'; > + > + return nice_str; > +} BLETCH is there not already a utility function to do this, and if not create one rather than spooging this in and really this ought to be less iffy. > + > + > +static char *gen_nice_title(const char *path) > +{ > + const char *tmp; > + char *nice_path, *cnv; > + char *title; > + int title_length; > + > + /* Convert path for display */ > + nice_path = malloc(strlen(path) * SLEN("&") + 1); > + if (nice_path == NULL) { > + return NULL; > + } > + > + /* Escape special HTML characters */ > + for (cnv = nice_path, tmp = path; *tmp != '\0'; tmp++) { > + if (*tmp == '<') { > + *cnv++ = '&'; > + *cnv++ = 'l'; > + *cnv++ = 't'; > + *cnv++ = ';'; > + } else if (*tmp == '>') { > + *cnv++ = '&'; > + *cnv++ = 'g'; > + *cnv++ = 't'; > + *cnv++ = ';'; > + } else if (*tmp == '&') { > + *cnv++ = '&'; > + *cnv++ = 'a'; > + *cnv++ = 'm'; > + *cnv++ = 'p'; > + *cnv++ = ';'; > + } else { > + *cnv++ = *tmp; > + } > + } > + *cnv = '\0'; didnt you just have this code above? > + > +bool gopher_need_generate(char type) > +{ > + switch (type) { > + case '1': > + case '7': > + return true; > + default: > + return false; > + } > +} > + magic letters again > + > +/** > + * Generates the top part of an HTML directory listing page > + * > + * \return true iff buffer filled without error > + * > + * This is part of a series of functions. To generate a complete page, > + * call the following functions in order: > + * > + * gopher_generate_top() > + * gopher_generate_title() > + * gopher_generate_row() -- call 'n' times for 'n' rows > + * gopher_generate_bottom() docuemnt this in the caller, not in every function > + */ > + > +static bool gopher_generate_top(char *buffer, int buffer_length) > +{ > + int error = snprintf(buffer, buffer_length, > + "<html>\n" > + "<head>\n" > + /*"<!-- base href=\"%s\" -->\n"*//* XXX: needs the > content url */ > + /* Don't do that: > + * seems to trigger a reparsing of the gopher data > itself as html... > + * "<meta http-equiv=\"Content-Type\" > content=\"text/html; charset=UTF-8\" />\n" > + */ yuk > + /* TODO: move this to clean CSS in internal.css */ well why have you not? > + "<link rel=\"stylesheet\" title=\"Standard\" " > + "type=\"text/css\" > href=\"resource:internal.css\">\n"); > + > + if (error < 0 || error >= buffer_length) > + /* Error or buffer too small */ > + return false; > + else > + /* OK */ > + return true; > +} > + > + > + > +/** > + * Internal worker called by gopher_generate_row(). > + */ > + > +static bool gopher_generate_row_internal(char type, char *fields[5], whats this magic number? not an enum so fileds[] has meaning? > + char *buffer, int buffer_length) > +{ > + char *nice_text; > + char *redirect_url = NULL; > + int error; > + bool alt_port = false; > + char *username = NULL; > + > + if (fields[3] && strcmp(fields[3], "70")) > + alt_port = true; > + > + /* escape html special characters */ > + nice_text = html_escape_string(fields[0]); > + > + /* XXX: outputting \n generates better looking html code, > + * but currently screws up indentation due to a bug. what bug? why isnt it fixed? > + */ > +#define HTML_LF > +/*#define HTML_LF "\n"*/ > + yuk > + switch (type) { > + case '.': > + /* end of the page */ > + *buffer = '\0'; > + break; > + case '0': /* text/plain link */ > + error = snprintf(buffer, buffer_length, > + "<a href=\"gopher://%s%s%s/%c%s\">"HTML_LF > + "<span class=\"text\">%s</span></a>"HTML_LF > + "<br/>"HTML_LF, > + fields[2], > + alt_port ? ":" : "", > + alt_port ? fields[3] : "", > + type, fields[1], nice_text); way too much magic in one printf. > + break; > + case '9': /* binary */ > + error = snprintf(buffer, buffer_length, > + "<a href=\"gopher://%s%s%s/%c%s\">"HTML_LF > + "<span class=\"binary\">%s</span></a>"HTML_LF > + "<br/>"HTML_LF, > + fields[2], > + alt_port ? ":" : "", > + alt_port ? fields[3] : "", > + type, fields[1], nice_text); > + break; ditto > + case '1': > + /* > + * directory link > + */ > + error = snprintf(buffer, buffer_length, > + "<a href=\"gopher://%s%s%s/%c%s\">"HTML_LF > + "<span class=\"dir\">%s</span></a>"HTML_LF > + "<br/>"HTML_LF, > + fields[2], > + alt_port ? ":" : "", > + alt_port ? fields[3] : "", > + type, fields[1], nice_text); > + break; and again > + case '3': > + /* Error > + */ > + error = snprintf(buffer, buffer_length, > + "<span class=\"error\">%s</span><br/>"HTML_LF, > + nice_text); > + break; > + case '7': > + /* TODO: handle search better. > + * For now we use an unnamed input field and accept sending > ?=foo > + * as it seems at least Veronica-2 ignores the = but it's > unclean. > + */ handle it or a better comment > + error = snprintf(buffer, buffer_length, > + "<form method=\"get\" > action=\"gopher://%s%s%s/%c%s\">"HTML_LF > + "<span class=\"query\">" > + "<label>%s " > + "<input name=\"\" type=\"text\" align=\"right\" > />" > + "</label>" > + "</span></form>"HTML_LF > + "<br/>"HTML_LF, > + fields[2], > + alt_port ? ":" : "", > + alt_port ? fields[3] : "", > + type, fields[1], nice_text); seriously? what? someome else might thave to read this stuff, please, less and more readable > + break; > + case '8': > + /* telnet: links > + * cf. gopher://78.80.30.202/1/ps3 > + * -> gopher://78.80.30.202:23/8/ps3/new -> new@78.80.30.202 > + */ > + alt_port = false; > + if (fields[3] && strcmp(fields[3], "23")) > + alt_port = true; > + username = strrchr(fields[1], '/'); > + if (username) > + username++; > + error = snprintf(buffer, buffer_length, > + "<a href=\"telnet://%s%s%s%s%s\">"HTML_LF > + "<span class=\"dir\">%s</span></a>"HTML_LF > + "<br/>"HTML_LF, > + username ? username : "", > + username ? "@" : "", > + fields[2], > + alt_port ? ":" : "", > + alt_port ? fields[3] : "", > + nice_text); > + break; ditto > + case 'g': > + case 'I': > + case 'p': > + /* quite dangerous, cf. gopher://namcub.accela-labs.com/1/pics > */ dangerous? explain! > + if (option_gopher_inline_images) { > + error = snprintf(buffer, buffer_length, > + "<a > href=\"gopher://%s%s%s/%c%s\">"HTML_LF > + "<span class=\"img\">%s "HTML_LF /* > </span><br/> */ > + //"<span class=\"img\" >"HTML_LF > + "<img src=\"gopher://%s%s%s/%c%s\" > alt=\"%s\"/>"HTML_LF > + "</span>" > + "</a>" > + "<br/>"HTML_LF, > + fields[2], > + alt_port ? ":" : "", > + alt_port ? fields[3] : "", > + type, fields[1], > + nice_text, > + fields[2], > + alt_port ? ":" : "", > + alt_port ? fields[3] : "", > + type, fields[1], > + nice_text); > + break; > + } complex > + /* fallback to default, link them */ > + error = snprintf(buffer, buffer_length, > + "<a href=\"gopher://%s%s%s/%c%s\">"HTML_LF > + "<span class=\"dir\">%s</span></a>"HTML_LF > + "<br/>"HTML_LF, > + fields[2], > + alt_port ? ":" : "", > + alt_port ? fields[3] : "", > + type, fields[1], nice_text); > + break; > + case 'h': > + if (fields[1] && strncmp(fields[1], "URL:", 4) == 0) SLEN() > + redirect_url = fields[1] + 4; > + /* cf. gopher://pineapple.vg/1 */ > + if (fields[1] && strncmp(fields[1], "/URL:", 5) == 0) > + redirect_url = fields[1] + 5; > + if (redirect_url) { > + error = snprintf(buffer, buffer_length, > + "<a href=\"%s\">"HTML_LF > + "<span > class=\"link\">%s</span></a>"HTML_LF > + "<br/>"HTML_LF, > + redirect_url, > + nice_text); > + } else { > + /* cf. gopher://sdf.org/1/sdf/classes/ */ > + error = snprintf(buffer, buffer_length, > + "<a > href=\"gopher://%s%s%s/%c%s\">"HTML_LF > + "<span > class=\"dir\">%s</span></a>"HTML_LF > + "<br/>"HTML_LF, > + fields[2], > + alt_port ? ":" : "", > + alt_port ? fields[3] : "", > + type, fields[1], nice_text); > + } > + break; > + case 'i': > + error = snprintf(buffer, buffer_length, > + "<span class=\"info\">%s</span><br/>"HTML_LF, > + nice_text); > + break; > + default: > + LOG(("warning: unknown gopher item type 0x%02x '%c'\n", type, > type)); > + error = snprintf(buffer, buffer_length, > + "<a href=\"gopher://%s%s%s/%c%s\">"HTML_LF > + "<span class=\"dir\">%s</span></a>"HTML_LF > + "<br/>"HTML_LF, > + fields[2], > + alt_port ? ":" : "", > + alt_port ? fields[3] : "", > + type, fields[1], nice_text); > + break; > + } > + > + free(nice_text); > + > + if (error < 0 || error >= buffer_length) > + /* Error or buffer too small */ > + return false; > + else > + /* OK */ > + return true; > +} overall this function was too long, too complex, too many "magic" value > + > +static bool gopher_generate_row(const char **data, size_t *size, > + char *buffer, int buffer_length) > +{ > + bool ok = false; > + char type = 0; > + int field = 0; > + /* name, selector, host, port, gopher+ flag */ > + char *fields[5] = { NULL, NULL, NULL, NULL, NULL }; magic value > + const char *s = *data; > + const char *p = *data; > + int i; > + > + for (; *size && *p; p++, (*size)--) { > + if (!type) { > + type = *p; > + if (!type || type == '\n' || type == '\r') { if type is not what? our idom is if (type == NULL) > + LOG(("warning: invalid gopher item type > 0x%02x\n", type)); > + } > + s++; > + continue; > + } > + switch (*p) { > + case '\n': > + if (field > 0) { > + LOG(("warning: unterminated gopher item > '%c'\n", type)); > + } > + //FALLTHROUGH no c++ comemnts please > + case '\r': > + if (*size < 1 || p[1] != '\n') { > + LOG(("warning: CR without LF in gopher item > '%c'\n", type)); > + } > + if (field < 3 && type != '.') { > + LOG(("warning: unterminated gopher item > '%c'\n", type)); > + } > + fields[field] = malloc(p - s + 1); > + memcpy(fields[field], s, p - s); > + fields[field][p - s] = '\0'; > + if (type == '.' && field == 0 && p == s) { > + ;/* XXX: signal end of page? For now we just > ignore it. */ > + } is that supposed to be a todo? > + ok = gopher_generate_row_internal(type, fields, buffer, > buffer_length); > + for (i = 0; i < 5; i++) { magic numbers and please can you try and make variable names meaningful, it is my personal pet peeve that we are writing C not fortran but good variable names are important. > + free(fields[i]); > + fields[i] = NULL; > + } > + (*size)--; > + p++; > + if (*size && *p == '\n') { > + p++; > + (*size)--; > + } > + *data = p; > + field = 0; > + return ok; > + case '\x09': > + if (field >= 4) { > + LOG(("warning: extra tab in gopher item > '%c'\n", type)); > + break; > + } > + fields[field] = malloc(p - s + 1); > + memcpy(fields[field], s, p - s); > + fields[field][p - s] = '\0'; not strndup? > Index: render/gopher.h > =================================================================== > --- /dev/null 2011-05-08 22:46:04.000000000 +0200 > +++ render/gopher.h 2011-05-08 22:41:11.000000000 +0200 > + > +#ifndef _NETSURF_RENDER_GOPHER_H_ > +#define _NETSURF_RENDER_GOPHER_H_ it has recently been pointe dout to us that macros beginning _ belong to the compiler so we ought not to use them plain NETSURF_RENDER_GOPHER_H would be ok > + > +#include <stddef.h> > + > +struct content; > +struct http_parameter; > + > +nserror gopher_init(void); > +void gopher_fini(void); > + > +const char *gopher_type_to_mime(char type); > +bool gopher_need_generate(char type); docuemntation? > > > Index: render/html.c > =================================================================== > --- render/html.c (revision 12321) > +++ render/html.c (working copy) > @@ -101,7 +101,7 @@ > unsigned int depth); > #endif > > -static const content_handler html_content_handler = { > +const content_handler html_content_handler = { very rude word > html_create, > html_process_data, > html_convert, > Index: utils/url.c > =================================================================== > --- utils/url.c (revision 12321) > +++ utils/url.c (working copy) > @@ -856,6 +856,38 @@ > } > > /** > + * Extract the gopher document type from an URL > + * > + * \param url an absolute URL > + * \param result pointer to buffer to hold result > + * \return URL_FUNC_OK on success > + */ > + > +url_func_result url_gopher_type(const char *url, char *result) > +{ > + url_func_result status; > + struct url_components components; > + > + assert(url); > + > + status = url_get_components(url, &components); > + if (status == URL_FUNC_OK) { > + if (!components.path) { > + status = URL_FUNC_FAILED; > + } else { > + if (strlen(components.path) < 2) > + *result = '1'; > + else if (components.path[0] == '/') > + *result = components.path[1]; > + else > + status = URL_FUNC_FAILED; > + } > + } > + url_destroy_components(&components); > + return status; > +} is this useful in the generic url handling? > Index: desktop/options.c > =================================================================== > --- desktop/options.c (revision 12321) > +++ desktop/options.c (working copy) > @@ -69,6 +69,8 @@ > char *option_http_proxy_auth_user = 0; > /** Proxy authentication password */ > char *option_http_proxy_auth_pass = 0; > +/** Inline image items in Gopher pages. Dangerous. */ > +bool option_gopher_inline_images = false; > /** Default font size / 0.1pt. */ > int option_font_size = 128; > /** Minimum font size. */ > @@ -248,6 +250,7 @@ > { "http_proxy_auth", OPTION_INTEGER, &option_http_proxy_auth }, > { "http_proxy_auth_user", OPTION_STRING, &option_http_proxy_auth_user }, > { "http_proxy_auth_pass", OPTION_STRING, &option_http_proxy_auth_pass }, > + { "gopher_inline_images", OPTION_BOOL, &option_gopher_inline_images }, > { "font_size", OPTION_INTEGER, &option_font_size }, > { "font_min_size", OPTION_INTEGER, &option_font_min_size }, > { "font_sans", OPTION_STRING, &option_font_sans }, would this not have been better as an option to extend the generic directory listing to show images inline? > Index: content/fetchers/curl.c > =================================================================== > --- content/fetchers/curl.c (revision 12321) > +++ content/fetchers/curl.c (working copy) various eww in teh curl handler...it really is supposed to be only for http(s) > > +void * fetch_curl_setup_gopher(struct fetch *parent_fetch, const char *url, > + bool only_2xx, const char *post_urlenc, > + const struct fetch_multipart_data *post_multipart, > + const char **headers) > +{ > + struct curl_fetch_info *f; > + const char *mime; > + char type; > + f = fetch_curl_setup(parent_fetch, url, only_2xx, post_urlenc, > + post_multipart, headers); > + if (url_gopher_type(url, &type) == URL_FUNC_OK && type) { > + f->gopher_type = type; > + } else { > + f->http_code = 404; > + fetch_set_http_code(f->fetch_handle, f->http_code); > + } > + > + mime = gopher_type_to_mime(type); > + /* TODO: add a fetch_mimetype_by_ext() or fetch_mimetype_sniff_data() */ err we already have a way of doing this for file: perhaps you can use that > + /* > + if (mime == NULL) > + mime = "application/octet-stream"; > + */ > + ummm? > @@ -771,6 +820,8 @@ > LOG(("done %s", f->url)); > > if (abort_fetch == false && result == CURLE_OK) { > + //if (f->gopher_type) > + //fetch_curl_gopher_data(NULL, 0, 0, f); > /* fetch completed normally */ > if (f->stopped || > (!f->had_headers && this is C not C++, apropriate comment syntax please and if its not useful *remove* it > @@ -978,6 +1029,23 @@ > struct curl_fetch_info *f = _f; > CURLcode code; > > + /* gopher data receives special treatment */ > + if (f->gopher_type && gopher_need_generate(f->gopher_type)) { > + /* type 3 items report an error */ > + if (!f->http_code) { > + if (data[0] == '3') { magic *again* blergh > + /* TODO: try to guess better from the string ? > + * like "3 '/bcd' doesn't exist!" > + * TODO: what about other file types ? > + */ > + f->http_code = 404; hmm do we not have an enum for there? bad project, no biccuit > + } else { > + f->http_code = 200; > + } > + fetch_set_http_code(f->fetch_handle, f->http_code); > + } > + } > + > /* ensure we only have to get this information once */ > if (!f->http_code) > { Even leaving aside the structural issues, this needs a *lot* of work before its merged. -- Regards Vincent