On 2013/09/24 22:29:56, janek wrote:
https://codereview.appspot.com/13256053/diff/1/lily/lexer.ll File lily/lexer.ll (right):
https://codereview.appspot.com/13256053/diff/1/lily/lexer.ll#newcode588
lily/lexer.ll:588: [^|*.=$#{}\"\\ \t\n\r\f0-9][^$#{}\"\\ \t\n\r\f0-9]*
{
On 2013/09/24 16:15:39, dak wrote: > On 2013/09/24 07:44:44, janek wrote: > > Hmm. From my point of view, this deserves some comment (but i
don't insist).
> > I have a problem thinking of a comment that would add any
information that is
> not immediately apparent from the pattern. > > Suggestions?
Maybe "because of regex greediness, we have to additionally exclude
patterns
beginning with |*.= (without this \lyricmode { \skip 1.*3 } fails)"?
Well, multiple matching regexps are messy and might call for an appropriate comment. But the whole point of the change is not to have multiple matching regexps any more. The original commit message of the commit that failed to do the job was: commit 38a4081efa4a8ee2f5da780ca0ed2991627afc46 Author: David Kastrup <d...@gnu.org> Date: Sun Sep 30 02:21:00 2012 +0200 Issue 2869: Regularize lyrics lexer mode That makes lyrics mode rather similar to markup mode regarding how words are formed. {} are never considered part of words unless enclosed in quotes. Unquoted words do not contain whitespace, braces, quotes, backslashes, numbers or Scheme expressions. In addition, they cannot start with * . = and | since that would mess with duration, assignment and barcheck syntax. This removes some remaining TeX-oriented cruft in the lexer. The set of word-non-starters might need revisiting, but at least the regtests seem to pass. The text that is relevant here is "In addition, [unquoted words] cannot start with * . = and | since that would mess with duration, assignment and barcheck syntax." After the fix, the pattern _explicitly_ (rather than implicitly by pattern order which did not work) states that unquoted words cannot start with * . = and |. The pattern is now a literal rendition of the description.
For me the regex itself required a few moments of thinking to
understand. The problem is that you want a comment describing the problems with code/patterns that is no longer there. I don't think that's helpful. A comment is called for when you use a trick to avoid a problem. But in this case, the problem was avoided by stopping to use a trick and instead being boringly explicit in the pattern. It's quite redundant now to state "this pattern won't match something starting with * . = or |" since the pattern explicitly excludes those characters in its first position. My first attempt was to give the first single-char pattern priority back by using [*.=]/.* (a "trailing context" which did not actually work because of technical restrictions). That would both have been fragile _and_ would have called for a comment in order to explain its purpose. But the purpose of a pattern [^*.=| ... is not to match * . = | in the first position of the pattern. That's basic. Comments can't hope to explain everything that's basic. They should tell the story that the code doesn't tell, at least not on its most direct surface level. And the code now tells the story in the most blunt way that is possible. Of course we can add a comment of the "we could do this in a trickier way that does not actually work" kind, but that's not helpful at all. It does not belong in the code. It might belong in the issue tracker, perhaps in the commit message. Just as a record of what went wrong. But the code, as it is written, does not offer a temptation of changing it back to the buggy previous version. It was not more elegant or obvious or clearer. https://codereview.appspot.com/13256053/ _______________________________________________ lilypond-devel mailing list lilypond-devel@gnu.org https://lists.gnu.org/mailman/listinfo/lilypond-devel