Re: [PATCH] sub-process: refactor handshake to common function
Lars Schneiderwrites: > Please note that I've recently refactored the capabilities negotiation a bit: > https://github.com/git/git/commit/1514c8edd62d96006cd1de31e906ed5798dd4681 > > This change is still cooking in `next`. I am not sure how this should/could > be handled but maybe you can use my refactoring as your base? I think they can play well together as independent topics. The known_caps[] thing you introduced essentially is the same as the struct subprocess_capability capablities[] Jonathan has. Please check $ git show "pu^{/jt/subprocess-handshake' into pu}" convert.c to see if it makes sense. Thanks.
Re: [PATCH] sub-process: refactor handshake to common function
> On 24 Jul 2017, at 23:38, Jonathan Tanwrote: > > Refactor, into a common function, the version and capability negotiation > done when invoking a long-running process as a clean or smudge filter. > This will be useful for other Git code that needs to interact similarly > with a long-running process. > > Signed-off-by: Jonathan Tan > --- > This will be useful for anyone implementing functionality like that in > [1]. > > It is unfortunate that the code grew larger because it had to be more > generic, but I think this is better than having (in the future) 2 or > more separate handshake functions. > > I also don't think that the protocol should be permissive - I think > things would be simpler if all protocol errors were fatal errors. As it > is, most parts are permissive, but packet_read_line() already dies > anyway upon a malformed packet, so it may not be too drastic a change to > change this. For reference, the original protocol comes from [2]. > > [1] > https://public-inbox.org/git/20170714132651.170708-2-benpe...@microsoft.com/ > [2] edcc858 ("convert: add filter..process option", 2016-10-17) Thanks for this refactoring! That's great! Please note that I've recently refactored the capabilities negotiation a bit: https://github.com/git/git/commit/1514c8edd62d96006cd1de31e906ed5798dd4681 This change is still cooking in `next`. I am not sure how this should/could be handled but maybe you can use my refactoring as your base? Thanks, Lars > --- > convert.c | 61 - > pkt-line.c| 19 --- > pkt-line.h| 2 -- > sub-process.c | 94 +++ > sub-process.h | 18 ++ > t/t0021-conversion.sh | 2 +- > 6 files changed, 120 insertions(+), 76 deletions(-) > > diff --git a/convert.c b/convert.c > index deaf0ba7b..ec8ecc2ea 100644 > --- a/convert.c > +++ b/convert.c > @@ -512,62 +512,15 @@ static struct hashmap subprocess_map; > > static int start_multi_file_filter_fn(struct subprocess_entry *subprocess) > { > - int err; > + static int versions[] = {2, 0}; > + static struct subprocess_capability capabilities[] = { > + {"clean", CAP_CLEAN}, {"smudge", CAP_SMUDGE}, {NULL, 0} > + }; > struct cmd2process *entry = (struct cmd2process *)subprocess; > - struct string_list cap_list = STRING_LIST_INIT_NODUP; > - char *cap_buf; > - const char *cap_name; > - struct child_process *process = >process; > - const char *cmd = subprocess->cmd; > - > - sigchain_push(SIGPIPE, SIG_IGN); > - > - err = packet_writel(process->in, "git-filter-client", "version=2", > NULL); > - if (err) > - goto done; > - > - err = strcmp(packet_read_line(process->out, NULL), "git-filter-server"); > - if (err) { > - error("external filter '%s' does not support filter protocol > version 2", cmd); > - goto done; > - } > - err = strcmp(packet_read_line(process->out, NULL), "version=2"); > - if (err) > - goto done; > - err = packet_read_line(process->out, NULL) != NULL; > - if (err) > - goto done; > - > - err = packet_writel(process->in, "capability=clean", > "capability=smudge", NULL); > - > - for (;;) { > - cap_buf = packet_read_line(process->out, NULL); > - if (!cap_buf) > - break; > - string_list_split_in_place(_list, cap_buf, '=', 1); > - > - if (cap_list.nr != 2 || strcmp(cap_list.items[0].string, > "capability")) > - continue; > - > - cap_name = cap_list.items[1].string; > - if (!strcmp(cap_name, "clean")) { > - entry->supported_capabilities |= CAP_CLEAN; > - } else if (!strcmp(cap_name, "smudge")) { > - entry->supported_capabilities |= CAP_SMUDGE; > - } else { > - warning( > - "external filter '%s' requested unsupported > filter capability '%s'", > - cmd, cap_name > - ); > - } > - > - string_list_clear(_list, 0); > - } > - > -done: > - sigchain_pop(SIGPIPE); > > - return err; > + return subprocess_handshake(subprocess, "git-filter-", versions, NULL, > + capabilities, > + >supported_capabilities); > } > > static int apply_multi_file_filter(const char *path, const char *src, size_t > len, > diff --git a/pkt-line.c b/pkt-line.c > index 9d845ecc3..7db911957 100644 > --- a/pkt-line.c > +++ b/pkt-line.c > @@ -171,25 +171,6 @@ int packet_write_fmt_gently(int fd, const char *fmt, ...) > return status; > } > > -int packet_writel(int fd, const char *line, ...) > -{ > - va_list args; > - int err; > -
Re: [PATCH] sub-process: refactor handshake to common function
On Tue, 25 Jul 2017 10:38:51 -0400 Ben Peartwrote: > Christian Couder has been working on a similar refactoring for the perl > versions of very similar helper functions. They can be found in the > following patch series: > > https://public-inbox.org/git/20170620075523.26961-1-chrisc...@tuxfamily.org/ > > In particular: > > - Patches 02/49 to 08/49 create a Git/Packet.pm module by >refactoring "t0021/rot13-filter.pl". Functions from this new >module will be used later in test scripts. > > Some differences I noticed: this patch does both the version and > capability negotiation and it lives in "sub-process.h/c" files. In the > perl script patch series, the version negotiation is in the new > "packet.pm" module but it does not include the capability negotiation. > > It would make sense to me for these to have the same API and usage > behaviors. Perhaps pull them together into a single patch series so > that we can review and keep them in sync? Thanks for the pointer. These are different languages and different use cases (mine for production, his for tests), though, so I don't think consistency in API and behavior is practical. > > It is unfortunate that the code grew larger because it had to be more > > generic, but I think this is better than having (in the future) 2 or > > more separate handshake functions. > > I'm always happy to see an effort to refactor common code into reusable > APIs. Its a good engineering practice that makes it easier to > stabilize, extend and maintain the code. The challenge is making sure > the common function supports all the requirements of the various callers > and that the increase in complexity doesn't outweigh the benefit of > centralizing the logic. Agreed - thanks. > > +int subprocess_handshake(struct subprocess_entry *entry, > > +const char *welcome_prefix, > > +int *versions, > > +int *chosen_version, > > +struct subprocess_capability *capabilities, > > +unsigned int *supported_capabilities) { > > + int version_scratch; > > + unsigned int capabilities_scratch; > > + struct child_process *process = >process; > > + int i; > > + char *line; > > + const char *p; > > + > > + if (!chosen_version) > > + chosen_version = _scratch; > > + if (!supported_capabilities) > > + supported_capabilities = _scratch; > > + > > + sigchain_push(SIGPIPE, SIG_IGN); > > + > > + if (packet_write_fmt_gently(process->in, "%sclient\n", > > + welcome_prefix)) { > > + error("Could not write client identification"); > > + goto error; > > + } > > + for (i = 0; versions[i]; i++) { > > + if (packet_write_fmt_gently(process->in, "version=%d\n", > > + versions[i])) { > > + error("Could not write requested version"); > > + goto error; > > + } > > + } > > + if (packet_flush_gently(process->in)) > > + goto error; > > + > > + if (!(line = packet_read_line(process->out, NULL)) || > > + !skip_prefix(line, welcome_prefix, ) || > > + strcmp(p, "server")) { > > + error("Unexpected line '%s', expected %sserver", > > + line ? line : "", welcome_prefix); > > + goto error; > > + } > > + if (!(line = packet_read_line(process->out, NULL)) || > > + !skip_prefix(line, "version=", ) || > > + strtol_i(p, 10, chosen_version)) { > > + error("Unexpected line '%s', expected version", > > + line ? line : ""); > > + goto error; > > + } > > + for (i = 0; versions[i]; i++) { > > + if (versions[i] == *chosen_version) > > + goto version_found; > > + } > > This doesn't look like it supports negotiating a common version between > the client and server. If a client supports version 1, 2, and 3 and the > server only supports version 1 and 2, which version and capabilities > will be used? In the protocol, the client sends a list of versions it supports (see the "for" loop above), and expects the server to write the single version that the server chooses (see the part where we read one single version line). In your example, the client would write "version=1\n", "version=2\n", and "version=3\n", and the server would write "version=2\n" (well, it could write "version=1\n" too). So version 2 (or 1) will be used.
Re: [PATCH] sub-process: refactor handshake to common function
On 7/24/2017 5:38 PM, Jonathan Tan wrote: Refactor, into a common function, the version and capability negotiation done when invoking a long-running process as a clean or smudge filter. This will be useful for other Git code that needs to interact similarly with a long-running process. Christian Couder has been working on a similar refactoring for the perl versions of very similar helper functions. They can be found in the following patch series: https://public-inbox.org/git/20170620075523.26961-1-chrisc...@tuxfamily.org/ In particular: - Patches 02/49 to 08/49 create a Git/Packet.pm module by refactoring "t0021/rot13-filter.pl". Functions from this new module will be used later in test scripts. Some differences I noticed: this patch does both the version and capability negotiation and it lives in "sub-process.h/c" files. In the perl script patch series, the version negotiation is in the new "packet.pm" module but it does not include the capability negotiation. It would make sense to me for these to have the same API and usage behaviors. Perhaps pull them together into a single patch series so that we can review and keep them in sync? Signed-off-by: Jonathan Tan--- This will be useful for anyone implementing functionality like that in [1]. It is unfortunate that the code grew larger because it had to be more generic, but I think this is better than having (in the future) 2 or more separate handshake functions. I'm always happy to see an effort to refactor common code into reusable APIs. Its a good engineering practice that makes it easier to stabilize, extend and maintain the code. The challenge is making sure the common function supports all the requirements of the various callers and that the increase in complexity doesn't outweigh the benefit of centralizing the logic. I also don't think that the protocol should be permissive - I think things would be simpler if all protocol errors were fatal errors. As it is, most parts are permissive, but packet_read_line() already dies anyway upon a malformed packet, so it may not be too drastic a change to change this. For reference, the original protocol comes from [2]. [1] https://public-inbox.org/git/20170714132651.170708-2-benpe...@microsoft.com/ [2] edcc858 ("convert: add filter..process option", 2016-10-17) --- convert.c | 61 - pkt-line.c| 19 --- pkt-line.h| 2 -- sub-process.c | 94 +++ sub-process.h | 18 ++ t/t0021-conversion.sh | 2 +- 6 files changed, 120 insertions(+), 76 deletions(-) diff --git a/convert.c b/convert.c index deaf0ba7b..ec8ecc2ea 100644 --- a/convert.c +++ b/convert.c @@ -512,62 +512,15 @@ static struct hashmap subprocess_map; static int start_multi_file_filter_fn(struct subprocess_entry *subprocess) { - int err; + static int versions[] = {2, 0}; + static struct subprocess_capability capabilities[] = { + {"clean", CAP_CLEAN}, {"smudge", CAP_SMUDGE}, {NULL, 0} + }; struct cmd2process *entry = (struct cmd2process *)subprocess; - struct string_list cap_list = STRING_LIST_INIT_NODUP; - char *cap_buf; - const char *cap_name; - struct child_process *process = >process; - const char *cmd = subprocess->cmd; - - sigchain_push(SIGPIPE, SIG_IGN); - - err = packet_writel(process->in, "git-filter-client", "version=2", NULL); - if (err) - goto done; - - err = strcmp(packet_read_line(process->out, NULL), "git-filter-server"); - if (err) { - error("external filter '%s' does not support filter protocol version 2", cmd); - goto done; - } - err = strcmp(packet_read_line(process->out, NULL), "version=2"); - if (err) - goto done; - err = packet_read_line(process->out, NULL) != NULL; - if (err) - goto done; - - err = packet_writel(process->in, "capability=clean", "capability=smudge", NULL); - - for (;;) { - cap_buf = packet_read_line(process->out, NULL); - if (!cap_buf) - break; - string_list_split_in_place(_list, cap_buf, '=', 1); - - if (cap_list.nr != 2 || strcmp(cap_list.items[0].string, "capability")) - continue; - - cap_name = cap_list.items[1].string; - if (!strcmp(cap_name, "clean")) { - entry->supported_capabilities |= CAP_CLEAN; - } else if (!strcmp(cap_name, "smudge")) { - entry->supported_capabilities |= CAP_SMUDGE; - } else { - warning( - "external filter '%s' requested unsupported filter capability '%s'", -
Re: [PATCH] sub-process: refactor handshake to common function
Hi, Jonathan Tan wrote: > Refactor, into a common function, the version and capability negotiation > done when invoking a long-running process as a clean or smudge filter. > This will be useful for other Git code that needs to interact similarly > with a long-running process. > > Signed-off-by: Jonathan Tan> --- Sounds like a sensible thing to do. [...] > --- a/sub-process.h > +++ b/sub-process.h > @@ -18,6 +18,11 @@ struct subprocess_entry { > struct child_process process; > }; > > +struct subprocess_capability { > + const char *name; > + unsigned int flag; What does this flag represent? What values can it have? A comment might help. [...] > @@ -41,6 +46,19 @@ static inline struct child_process > *subprocess_get_child_process( > return >process; > } > > +/* > + * Perform the handshake to a long-running process as described in the > + * gitattributes documentation using the given requested versions and > + * capabilities. The "versions" and "capabilities" parameters are arrays > + * terminated by a 0 or blank struct. > + */ > +int subprocess_handshake(struct subprocess_entry *entry, > + const char *welcome_prefix, > + int *versions, > + int *chosen_version, > + struct subprocess_capability *capabilities, > + unsigned int *supported_capabilities); > + Is there a more precise pointer within the gitattributes documentation that describes what this does? I searched for "handshake" and found nothing. Is the "Long Running Filter Process" section where I should have started? The API docs for this header are currently in Documentation/technical/api-sub-process.txt. Perhaps these docs should also go there, or, even better, the docs in Documentation/technical/ could move to this header in a preparatory patch. Aside (not about this patch): why is the subprocess object called struct subprocess_entry? Would it make sense to rename it to struct subprocess? Back to this function. Is this a function I should only call once, when first launching a subprocess, or can I call it again later? A note about suggested usage might help. [...] > --- a/t/t0021-conversion.sh > +++ b/t/t0021-conversion.sh > @@ -697,7 +697,7 @@ test_expect_success PERL 'invalid process filter must > fail (and not hang!)' ' > > cp "$TEST_ROOT/test.o" test.r && > test_must_fail git add . 2>git-stderr.log && > - grep "does not support filter protocol version" git-stderr.log > + grep "expected git-filter-server" git-stderr.log This error message looks a little less friendly than the old one. Is that okay because it is not supposed to happen in practice? [...] > --- a/convert.c > +++ b/convert.c > @@ -512,62 +512,15 @@ static struct hashmap subprocess_map; > > static int start_multi_file_filter_fn(struct subprocess_entry *subprocess) > { > - int err; > - struct cmd2process *entry = (struct cmd2process *)subprocess; [... many lines snipped ...] > - return err; > + static int versions[] = {2, 0}; > + static struct subprocess_capability capabilities[] = { > + {"clean", CAP_CLEAN}, {"smudge", CAP_SMUDGE}, {NULL, 0} > + }; > + struct cmd2process *entry = (struct cmd2process *)subprocess; > + > + return subprocess_handshake(subprocess, "git-filter-", versions, NULL, > + capabilities, > + >supported_capabilities); > } API looks nice. [...] > --- a/sub-process.c > +++ b/sub-process.c > @@ -105,3 +105,97 @@ int subprocess_start(struct hashmap *hashmap, struct > subprocess_entry *entry, co Implementation looks sane from a quick glance. Thanks, Jonathan