On Thu, Feb 13, 2020 at 1:25 PM Robert Helling <[email protected]> wrote:
>
> Maybe somebody with a better understanding of the code (Linus?) can help me 
> out with this.

Hmm. That

   event 40:00 type=8 divemode="OC" name="modechange"

line actually violates the original git save format thing - each line
is supposed to have only one string in it. That's because the parser
of each line is really simple, and just breaks on the next whitespace.

See the comment above "parse_one_string()":

 * We have a very simple line-based interface, with the small
 * complication that lines can have strings in the middle, and
 * a string can be multiple lines.
 *
 * The UTF-8 string escaping is *very* simple, though:
 *
 *  - a string starts and ends with double quotes (")
 *
 *  - inside the string we escape:
 *     (a) double quotes with '\"'
 *     (b) backslash (\) with '\\'
 *
 *  - additionally, for human readability, we escape
 *    newlines with '\n\t', with the exception that
 *    consecutive newlines are left unescaped (so an
 *    empty line doesn't become a line with just a tab
 *    on it).
 *
 * Also, while the UTF-8 string can have arbitrarily
 * long lines, the non-string parts of the lines are
 * never long, so we can use a small temporary buffer
 * on stack for that part.
 *
 * Also, note that if a line has one or more strings
 * in it:
 *
 *  - each string will be represented as a single '"'
 *    character in the output.
 *
 *  - all string will exist in the same 'membuffer',
 *    separated by NUL characters (that cannot exist
 *    in a string, not even quoted).

And what happens is that now that there are *two* strings on that
line, the string buffer will have that second case of the membuffer
rule:

  - all string will exist in the same 'membuffer',
    separated by NUL characters (that cannot exist
    in a string, not even quoted).

and so now the 'str' membuffer actually contains the buffer
"OC\0modechange\0", but the event parsing code assumes that there was
at most one string, and that one string was the name.

The "OC" string confuses it, and makes it use "OC" as the name.

Very unfortunate. This problem was introduced when Willem Ferguson
started saving the dive mode in commit b9174332d ("Read and write
divemode changes (xml and git)"), and nobody realized that "oops, now
it's corrupting the event name".

The only thing that used to have multiple strings is the "dive flags"
thing, and that one carefully walks all the strings in the membuffer.
See parse_dive_tags().

Ho humm.

Does this extremely ugly patch fix it for you?

               Linus
 core/load-git.c | 21 +++++++++++++++++++--
 1 file changed, 19 insertions(+), 2 deletions(-)

diff --git a/core/load-git.c b/core/load-git.c
index efdb52940..ac19e4c0a 100644
--- a/core/load-git.c
+++ b/core/load-git.c
@@ -796,9 +796,26 @@ static void parse_dc_event(char *line, struct membuffer *str, struct git_parser_
 		line = parse_keyvalue_entry(parse_event_keyvalue, &event, line);
 	}
 
+	/*
+	 * The name is the last string in our membuffer.
+	 *
+	 * We used to have just one string there, but now
+	 * we may have the "divemode" string too.
+	 *
+	 * So walk from the end of the membuffer until
+	 * you either hit the beginning, or you hit a NUL
+	 * character.  That is the name string.
+	 */
 	name = "";
-	if (str->len)
-		name = mb_cstring(str);
+	if (str->len) {
+		const char *begin, *end;
+		begin = mb_cstring(str);
+		end = begin + str->len - 1;
+		while (end > begin && *--end)
+			/* nothing */;
+		name = end;
+	}
+
 	ev = add_event(state->active_dc, event.time.seconds, event.type, event.flags, event.value, name);
 
 	/*
_______________________________________________
subsurface mailing list
[email protected]
http://lists.subsurface-divelog.org/cgi-bin/mailman/listinfo/subsurface

Reply via email to