Re: [PATCH 2/2] git_connect: factor out discovery of the protocol and its parts
On 2013-11-05 22.22, Johannes Sixt wrote: Am 05.11.2013 21:45, schrieb Torsten Bögershausen: On 2013-11-05 20.39, Johannes Sixt wrote: Thanks for picking this up, please see some minor nits inline, and git_connect() is at the end -struct child_process *git_connect(int fd[2], const char *url_orig, - const char *prog, int flags) +static enum protocol parse_connect_url(const char *url_orig, char **ret_host, + char **ret_port, char **ret_path) { char *url; char *host, *path; char *end; Can we put all the char * into one single line? The idea here was to keep the diff minimal, and that further slight cleanups should be combined with subsequent rewrites that should happen to this function. int c; @@ -645,6 +628,49 @@ struct child_process *git_connect(int fd[2], const char *url_orig, if (protocol == PROTO_SSH host != url) port = get_port(end); + *ret_host = xstrdup(host); + if (port) + *ret_port = xstrdup(port); + else + *ret_port = NULL; + if (free_path) + *ret_path = path; + else + *ret_path = xstrdup(path); + free(url); + return protocol; +} + +static struct child_process no_fork; + +/* + * This returns a dummy child_process if the transport protocol does not + * need fork(2), or a struct child_process object if it does. Once done, + * finish the connection with finish_connect() with the value returned from + * this function (it is safe to call finish_connect() with NULL to support + * the former case). + * + * If it returns, the connect is successful; it just dies on errors (this + * will hopefully be changed in a libification effort, to return NULL when + * the connection failed). + */ +struct child_process *git_connect(int fd[2], const char *url, + const char *prog, int flags) +{ + char *host, *path; + struct child_process *conn = no_fork; + enum protocol protocol; + char *port; + const char **arg; + struct strbuf cmd = STRBUF_INIT; + + /* Without this we cannot rely on waitpid() to tell +* what happened to our children. +*/ + signal(SIGCHLD, SIG_DFL); + + protocol = parse_connect_url(url, host, port, path); + if (protocol == PROTO_GIT) { /* These underlying connection commands die() if they * cannot connect. @@ -666,9 +692,9 @@ struct child_process *git_connect(int fd[2], const char *url_orig, prog, path, 0, target_host, 0); free(target_host); This is hard to see in the diff, I think we don't need target_host any more. I though that as well first, but no, we still need it. Further rewrites are needed that move the port discovery from git_proxy_connect() and git_tcp_connect() to the new parse_connect_url() before target_host can go away. And even then it is questionable because target_host is used in an error message and is intended to reflect the original combined host+port portion of the URL, if I read the code correctly. - free(url); - if (free_path) - free(path); + free(host); + free(port); + free(path); return conn; } @@ -709,9 +735,9 @@ struct child_process *git_connect(int fd[2], const char *url_orig, fd[0] = conn-out; /* read from child's stdout */ fd[1] = conn-in; /* write to child's stdin */ strbuf_release(cmd); - free(url); - if (free_path) - free(path); This end of function, free everything and return conn, could we re-arange so that it is in the code only once ? That would be quite simple now; just place the part after the first return into the else branch. That opens opportunities to move variable declarations from the top of the function into the else branch. But all of these changes should go into a separate commit, IMO, so that the function splitting that happens here can be verified more easily. -- Hannes Agreed on all points, (some re-reading was needed) I will first focus on the test cases, since having god test cases eases us the re-factoring later on. Thanks /Torsten -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 2/2] git_connect: factor out discovery of the protocol and its parts
git_connect has grown large due to the many different protocols syntaxes that are supported. Move the part of the function that parses the URL to connect to into a separate function for readability. Signed-off-by: Johannes Sixt j...@kdbg.org --- Apart from this simplification, the protocol parsing code is a complete mystery to me. There is more potential for simplification because git_proxy_connect() and git_tcp_connect() do their own host+port parsing. connect.c | 80 ++- 1 file changed, 53 insertions(+), 27 deletions(-) diff --git a/connect.c b/connect.c index 6cc1f8d..a6cf345 100644 --- a/connect.c +++ b/connect.c @@ -543,37 +543,20 @@ static char *get_port(char *host) return NULL; } -static struct child_process no_fork; - /* - * This returns a dummy child_process if the transport protocol does not - * need fork(2), or a struct child_process object if it does. Once done, - * finish the connection with finish_connect() with the value returned from - * this function (it is safe to call finish_connect() with NULL to support - * the former case). - * - * If it returns, the connect is successful; it just dies on errors (this - * will hopefully be changed in a libification effort, to return NULL when - * the connection failed). + * Extract protocol and relevant parts from the specified connection URL. + * The caller must free() the returned strings. */ -struct child_process *git_connect(int fd[2], const char *url_orig, - const char *prog, int flags) +static enum protocol parse_connect_url(const char *url_orig, char **ret_host, + char **ret_port, char **ret_path) { char *url; char *host, *path; char *end; int c; - struct child_process *conn = no_fork; enum protocol protocol = PROTO_LOCAL; int free_path = 0; char *port = NULL; - const char **arg; - struct strbuf cmd = STRBUF_INIT; - - /* Without this we cannot rely on waitpid() to tell -* what happened to our children. -*/ - signal(SIGCHLD, SIG_DFL); if (is_url(url_orig)) url = url_decode(url_orig); @@ -645,6 +628,49 @@ struct child_process *git_connect(int fd[2], const char *url_orig, if (protocol == PROTO_SSH host != url) port = get_port(end); + *ret_host = xstrdup(host); + if (port) + *ret_port = xstrdup(port); + else + *ret_port = NULL; + if (free_path) + *ret_path = path; + else + *ret_path = xstrdup(path); + free(url); + return protocol; +} + +static struct child_process no_fork; + +/* + * This returns a dummy child_process if the transport protocol does not + * need fork(2), or a struct child_process object if it does. Once done, + * finish the connection with finish_connect() with the value returned from + * this function (it is safe to call finish_connect() with NULL to support + * the former case). + * + * If it returns, the connect is successful; it just dies on errors (this + * will hopefully be changed in a libification effort, to return NULL when + * the connection failed). + */ +struct child_process *git_connect(int fd[2], const char *url, + const char *prog, int flags) +{ + char *host, *path; + struct child_process *conn = no_fork; + enum protocol protocol; + char *port; + const char **arg; + struct strbuf cmd = STRBUF_INIT; + + /* Without this we cannot rely on waitpid() to tell +* what happened to our children. +*/ + signal(SIGCHLD, SIG_DFL); + + protocol = parse_connect_url(url, host, port, path); + if (protocol == PROTO_GIT) { /* These underlying connection commands die() if they * cannot connect. @@ -666,9 +692,9 @@ struct child_process *git_connect(int fd[2], const char *url_orig, prog, path, 0, target_host, 0); free(target_host); - free(url); - if (free_path) - free(path); + free(host); + free(port); + free(path); return conn; } @@ -709,9 +735,9 @@ struct child_process *git_connect(int fd[2], const char *url_orig, fd[0] = conn-out; /* read from child's stdout */ fd[1] = conn-in; /* write to child's stdin */ strbuf_release(cmd); - free(url); - if (free_path) - free(path); + free(host); + free(port); + free(path); return conn; } -- 1.8.4.33.gd68f7e8 -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 2/2] git_connect: factor out discovery of the protocol and its parts
On 2013-11-05 20.39, Johannes Sixt wrote: Thanks for picking this up, please see some minor nits inline, and git_connect() is at the end -struct child_process *git_connect(int fd[2], const char *url_orig, - const char *prog, int flags) +static enum protocol parse_connect_url(const char *url_orig, char **ret_host, +char **ret_port, char **ret_path) { char *url; char *host, *path; char *end; Can we put all the char * into one single line? int c; @@ -645,6 +628,49 @@ struct child_process *git_connect(int fd[2], const char *url_orig, if (protocol == PROTO_SSH host != url) port = get_port(end); + *ret_host = xstrdup(host); + if (port) + *ret_port = xstrdup(port); + else + *ret_port = NULL; + if (free_path) + *ret_path = path; + else + *ret_path = xstrdup(path); + free(url); + return protocol; +} + +static struct child_process no_fork; + +/* + * This returns a dummy child_process if the transport protocol does not + * need fork(2), or a struct child_process object if it does. Once done, + * finish the connection with finish_connect() with the value returned from + * this function (it is safe to call finish_connect() with NULL to support + * the former case). + * + * If it returns, the connect is successful; it just dies on errors (this + * will hopefully be changed in a libification effort, to return NULL when + * the connection failed). + */ +struct child_process *git_connect(int fd[2], const char *url, + const char *prog, int flags) +{ + char *host, *path; + struct child_process *conn = no_fork; + enum protocol protocol; + char *port; + const char **arg; + struct strbuf cmd = STRBUF_INIT; + + /* Without this we cannot rely on waitpid() to tell + * what happened to our children. + */ + signal(SIGCHLD, SIG_DFL); + + protocol = parse_connect_url(url, host, port, path); + if (protocol == PROTO_GIT) { /* These underlying connection commands die() if they * cannot connect. @@ -666,9 +692,9 @@ struct child_process *git_connect(int fd[2], const char *url_orig, prog, path, 0, target_host, 0); free(target_host); This is hard to see in the diff, I think we don't need target_host any more. - free(url); - if (free_path) - free(path); + free(host); + free(port); + free(path); return conn; } @@ -709,9 +735,9 @@ struct child_process *git_connect(int fd[2], const char *url_orig, fd[0] = conn-out; /* read from child's stdout */ fd[1] = conn-in; /* write to child's stdin */ strbuf_release(cmd); - free(url); - if (free_path) - free(path); This end of function, free everything and return conn, could we re-arange so that it is in the code only once ? + free(host); + free(port); + free(path); return conn; } struct child_process *git_connect(int fd[2], const char *url, const char *prog, int flags) { char *host, *port, *path; struct child_process *conn = no_fork; enum protocol protocol; const char **arg; struct strbuf cmd = STRBUF_INIT; /* Without this we cannot rely on waitpid() to tell * what happened to our children. */ signal(SIGCHLD, SIG_DFL); protocol = parse_connect_url(url, host, port, path); if (protocol == PROTO_GIT) { /* These underlying connection commands die() if they * cannot connect. */ if (git_use_proxy(host)) conn = git_proxy_connect(fd, host); else git_tcp_connect(fd, host, flags); /* * Separate original protocol components prog and path * from extended host header with a NUL byte. * * Note: Do not add any other headers here! Doing so * will cause older git-daemon servers to crash. */ packet_write(fd[1], %s %s%chost=%s%c, prog, path, 0, host, 0); } else { conn = xcalloc(1, sizeof(*conn)); strbuf_addstr(cmd, prog); strbuf_addch(cmd, ' '); sq_quote_buf(cmd, path); conn-in = conn-out = -1; conn-argv = arg = xcalloc(7, sizeof(*arg)); if (protocol == PROTO_SSH) { const char *ssh = getenv(GIT_SSH); int putty = ssh strcasestr(ssh, plink); if (!ssh) ssh = ssh; *arg++ = ssh; if (putty !strcasestr(ssh, tortoiseplink)) *arg++ = -batch; if (port) { /* P is for PuTTY, p
Re: [PATCH 2/2] git_connect: factor out discovery of the protocol and its parts
Am 05.11.2013 21:45, schrieb Torsten Bögershausen: On 2013-11-05 20.39, Johannes Sixt wrote: Thanks for picking this up, please see some minor nits inline, and git_connect() is at the end -struct child_process *git_connect(int fd[2], const char *url_orig, - const char *prog, int flags) +static enum protocol parse_connect_url(const char *url_orig, char **ret_host, + char **ret_port, char **ret_path) { char *url; char *host, *path; char *end; Can we put all the char * into one single line? The idea here was to keep the diff minimal, and that further slight cleanups should be combined with subsequent rewrites that should happen to this function. int c; @@ -645,6 +628,49 @@ struct child_process *git_connect(int fd[2], const char *url_orig, if (protocol == PROTO_SSH host != url) port = get_port(end); +*ret_host = xstrdup(host); +if (port) +*ret_port = xstrdup(port); +else +*ret_port = NULL; +if (free_path) +*ret_path = path; +else +*ret_path = xstrdup(path); +free(url); +return protocol; +} + +static struct child_process no_fork; + +/* + * This returns a dummy child_process if the transport protocol does not + * need fork(2), or a struct child_process object if it does. Once done, + * finish the connection with finish_connect() with the value returned from + * this function (it is safe to call finish_connect() with NULL to support + * the former case). + * + * If it returns, the connect is successful; it just dies on errors (this + * will hopefully be changed in a libification effort, to return NULL when + * the connection failed). + */ +struct child_process *git_connect(int fd[2], const char *url, + const char *prog, int flags) +{ +char *host, *path; +struct child_process *conn = no_fork; +enum protocol protocol; +char *port; +const char **arg; +struct strbuf cmd = STRBUF_INIT; + +/* Without this we cannot rely on waitpid() to tell + * what happened to our children. + */ +signal(SIGCHLD, SIG_DFL); + +protocol = parse_connect_url(url, host, port, path); + if (protocol == PROTO_GIT) { /* These underlying connection commands die() if they * cannot connect. @@ -666,9 +692,9 @@ struct child_process *git_connect(int fd[2], const char *url_orig, prog, path, 0, target_host, 0); free(target_host); This is hard to see in the diff, I think we don't need target_host any more. I though that as well first, but no, we still need it. Further rewrites are needed that move the port discovery from git_proxy_connect() and git_tcp_connect() to the new parse_connect_url() before target_host can go away. And even then it is questionable because target_host is used in an error message and is intended to reflect the original combined host+port portion of the URL, if I read the code correctly. -free(url); -if (free_path) -free(path); +free(host); +free(port); +free(path); return conn; } @@ -709,9 +735,9 @@ struct child_process *git_connect(int fd[2], const char *url_orig, fd[0] = conn-out; /* read from child's stdout */ fd[1] = conn-in; /* write to child's stdin */ strbuf_release(cmd); -free(url); -if (free_path) -free(path); This end of function, free everything and return conn, could we re-arange so that it is in the code only once ? That would be quite simple now; just place the part after the first return into the else branch. That opens opportunities to move variable declarations from the top of the function into the else branch. But all of these changes should go into a separate commit, IMO, so that the function splitting that happens here can be verified more easily. -- Hannes -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html