Follow-up Comment #1, bug #60977 (project groff): Current WIP.
commit fd18b00bba9a4faf0318270380f5cceb86eba5b7 Author: G. Branden Robinson <[email protected]> AuthorDate: Tue Jul 27 16:51:21 2021 +1000 Commit: G. Branden Robinson <[email protected]> CommitDate: Tue Jul 27 17:46:05 2021 +1000 [troff]: Fix Savannah #60977. [troff]: Avoid using sprintf() with user-controlled format string. * src/preproc/html/pre-html.cpp (makeFileName): Add comment noting need for implementation synchrony between this function, which generates \O5 escape sequences, and troff's suppress_node::tprint member function, which interprets them. * src/roff/troff/node.cpp: Rename 2 static globals for clarity. - `last_image_filename` -> `image_filename` - `last_image_id` -> `subimage_counter` (suppress_node::tprint): Set up the buffer for image file name to be written using a constant rather than an embedded literal. Unconditionally initialize the buffer with a string terminator, so there is no chance of a read from unitialized storage. Drop unused code involving `tem`. Drop stale comments. Clarify comment: an `image_filename` doesn't _always_ contain a format string, only sometimes. Replace use of `sprintf` with manual construction of a new image filename string. There are two cases, one where a format string {presently "%d"} is present, and one where it is not. If it is present, locate it {`percent`}. This means a limited/bounded image ("subimage") is being processed; increment the subimage counter. Write a new image file name preserving the parts before and after "%d" (the "prefix" and "suffix", and replacing only the middle, using `sprintf` with the subimage counter and the (string literal) format. Be mindful of string bounds and memory allocation, issuing diagnostics or aborting as necessary. If the image file name does _not_ contain a format string, but needs only to be copied, do that (`strcpy`), again instead of using `sprintf`. Fixes <https://savannah.gnu.org/bugs/?60977>. Implementation note: * Why all these standard C library string manipulations? (1) I'm more familiar with C than C++. (2) The inbound type I'm dealing with, {last,}image_filename, is a char*, not a C++ string type. (3) groff does not use a standard C++ library string type, but its own implementation[1]; groff is _old_). (4) That custom string class does not have any `printf`-style format methods that I can see. Incidentally, groff's own `i_to_a` library function[2] doesn't handle long ints, which `size_t` is, so you can't put longs into diagnostic messages among other places. (It looks easy to change `i_to_a` and `ui_to_a` to support longs. You can then watch the build explode into a thousand pieces with numeric overflows in troff.) (5) I did attempt to handle the strings efficiently, never calling strcat() since I always knew where the end of a string I wanted to append to was[3]. [1] src/include/stringclass.h [2] src/libs/libgroff/itoa.c [3] https://symas.com/the-sad-state-of-c-strings/ diff --git a/src/preproc/html/pre-html.cpp b/src/preproc/html/pre-html.cpp index 88fcca18..501e90cb 100644 --- a/src/preproc/html/pre-html.cpp +++ b/src/preproc/html/pre-html.cpp @@ -544,6 +544,7 @@ static void makeFileName(void) if (image_template == NULL) sys_fatal("malloc"); strcpy(image_template, macroset_template); + // Keep this format string synced with troff:suppress_node::tprint(). strcat(image_template, "%d"); } diff --git a/src/roff/troff/node.cpp b/src/roff/troff/node.cpp index f2fc6da8..86b054e8 100644 --- a/src/roff/troff/node.cpp +++ b/src/roff/troff/node.cpp @@ -4053,8 +4053,8 @@ void suppress_node::put(troff_output_file *out, const char *s) */ static char last_position = 0; -static const char *last_image_filename = 0; -static int last_image_id = 0; +static const char *image_filename = 0; +static int subimage_counter = 0; inline int min(int a, int b) { @@ -4086,23 +4086,62 @@ void suppress_node::tprint(troff_output_file *out) if (is_on == 2) { // remember position and filename last_position = position; - char *tem = (char *)last_image_filename; - last_image_filename = strsave(filename.contents()); - if (tem) - free(tem); - last_image_id = image_id; - // printf("start of image and page = %d\n", current_page); + image_filename = strsave(filename.contents()); } else { - // now check whether the suppress node requires us to issue limits. + // Now check whether the suppress node requires us to issue limits. if (emit_limits) { - char name[8192]; - // remember that the filename will contain a %d in which the - // last_image_id is placed - if (last_image_filename == (char *) 0) - *name = '\0'; - else - sprintf(name, last_image_filename, last_image_id); + const size_t namebuflen = 8192; + char name[namebuflen] = { '\0' }; + // Jump through a flaming hoop to avoid a "format nonliteral" + // warning from blindly using sprintf...and avoid trouble from + // mischievous image stems. + // + // Keep this format string synced with pre-html:makeFileName(). + const char format[] = "%d"; + const size_t format_len = strlen(format); + const char *percent_position = strstr(image_filename, format); + if (percent_position) { + subimage_counter++; + assert(sizeof subimage_counter <= 8); + // A 64-bit signed int produces up to 19 decimal digits. + char *subimage_number = (char *)malloc(20); // 19 digits + \0 + if (0 == subimage_number) + fatal("memory allocation failure"); + // Replace the %d in the filename with this number. + size_t enough = strlen(image_filename) + 19 - format_len; + char *new_name = (char *)malloc(enough); + if (0 == new_name) + fatal("memory allocation failure"); + ptrdiff_t prefix_length = percent_position - image_filename; + strncpy(new_name, image_filename, prefix_length); + sprintf(subimage_number, "%d", subimage_counter); + size_t number_length = strlen(subimage_number); + strcpy(new_name + prefix_length, subimage_number); + // Skip over the format in the source string. + const char *suffix_src = image_filename + prefix_length + + format_len; + char *suffix_dst = new_name + prefix_length + number_length; + strcpy(suffix_dst, suffix_src); + // Ensure the new string fits with room for a terminal '\0'. + const size_t len = strlen(new_name); + if (len > (namebuflen - 1)) + error("constructed file name in suppressed output escape is" + " too long (>= %1 bytes); skipping image", + (int)namebuflen); + else + strncpy(name, new_name, (namebuflen - 1)); + free(new_name); + free(subimage_number); + } + else { + const size_t len = strlen(image_filename); + if (len > (namebuflen - 1)) + error("file name in suppressed output escape is too long" + " (>= %1 bytes); skipping image", (int)namebuflen); + else + strcpy(name, image_filename); + } if (is_html) { switch (last_position) { case 'c': _______________________________________________________ Reply to this item at: <https://savannah.gnu.org/bugs/?60977> _______________________________________________ Message sent via Savannah https://savannah.gnu.org/
