On 19/01/2014 14:51, Dimitar Zhekov wrote: > On Sat, 18 Jan 2014 17:10:37 +0200 > Arthur Rosenstein <artros.m...@gmail.com> wrote: > >> I took a quick glance at patch 11 [...] It treats compiler "column >> numbers" as if they're actually column numbers, while in reality >> they almost never are. Which means that it's not going to work >> right as soon as the code contains tabs or multibyte characters. > > Hmmm... :) > > <tab><tab>(s - text == 10 && "боза" xxx ... [utf-8] > > utils.c:643:33: error: expected ')' before 'xxx' > > Double click on the message: seeks to the first 'x', Geany status > bar shows line 643, column 43 (column # depends on the tab size). > > But it *does* seek to 'а' if "боза" is locale.
Indeed, my bad. You're not treating "column numbers" as column numbers but rather UTF-8 byte offsets. That's definitely more common, but not universal still. > >> Since it doesn't use named capture groups for the message >> parsing, it's a little less flexible in the type of messages >> it can parse > > line [column] filename, or filename line [column] > > If no regex is specified (and most people leave it blank I think), > the "fallback" Geany parser requires only one of filename or line > (default filename == that of the current document). > > With a regex, both the filename and line are mandatory (though the > regex Geany parser can be modified to make the line optional). > > P11 adds an optional column without altering the above behaviour. > I think it's a better idea to reduce the surface area by moving everything to the regex based parser and getting rid of the hard-coded one. I left it as a fallback just to be on the safe side, but if regex based parsing proves to work well for a release or two it can be safely dropped. >> and it also cannot handle different >> styles of error messages in the same regex. > > expr1|expr2|... Won't work. See blow. > > Or if you mean errors vs. warnings, Geany does not support such a > distinction, and P11 only adds an optional column. That's not what I mean. > >> The example given in the documentation is actually >> not going to work when the column number is missing. > > echo "test.py:7:24: E202 whitespace before ']'" | > egrep '([^:]+):([0-9] +):([0-9]+): |([^:]+):([0-9]+): ' > test.py:7:24: E202 whitespace before ']' > > echo "test.py:7: E202 whitespace before ']'" | > egrep '([^:]+):([0-9]+): ([0-9]+): |([^:]+):([0-9]+): ' > test.py:7: E202 whitespace before ']' > Did you actually test it? You're using g_match_info_fetch() to read the submatches of the first three *capture groups*. Those are indexed independently of the string you're trying to match. So, even though the pattern does match the error message in the file:line case, you're reading the source-location info from the capture groups that didn't match anything. Note that in this case the solution is simple enough: "([^:]+):([0-9]+)(?::([0-9]+))?: ", but the general limitation is still there. > Overall, P11 is a minimal implementation, which handles most of > the cases, with very small regex changes (an optional 3rd match). > PR 191 is the maximum implementation. Both have pros and cons. I think this is a slight overstatement. PR191 is really not *that* much bigger or radical, and a lot of the code that it adds is modular and can be used for other things (which I do, FWIW, in the Compiler Errors window for example). -- Cheers, Arthur _______________________________________________ Devel mailing list Devel@lists.geany.org https://lists.geany.org/cgi-bin/mailman/listinfo/devel