Stefan Kaltenbrunner <[EMAIL PROTECTED]> writes:
> animal: lionfish            warnings: 16
> scan.l:180: warning, the character range [<80>-<FF>] is ambiguous in a
> case-insensitive scanner
> scan.l:180: warning, the character range [<80>-<FF>] is ambiguous in a
> case-insensitive scanner
> scan.l:302: warning, the character range [<80>-<FF>] is ambiguous in a
> case-insensitive scanner

This is evidently complaining about plpgsql's scan.l, which specifies
%option case-insensitive
and then defines
ident_start             [A-Za-z\200-\377_]
which is the way we do it in the main grammar too.  But I've never
seen this message in any of the flex versions I've used with PG.
(Which flex version is installed on lionfish anyway?)

I find some relevant points in the flex manual:
http://flex.sourceforge.net/manual/Patterns.html

  Character classes are expanded immediately when seen in the flex
  input. This means the character classes are sensitive to the locale in
  which flex is executed, and the resulting scanner will not be sensitive
  to the runtime locale. This may or may not be desirable.
  
  Character classes with ranges, such as `[a-Z]', should be used with
  caution in a case-insensitive scanner if the range spans upper or
  lowercase characters. Flex does not know if you want to fold all upper
  and lowercase characters together, or if you want the literal numeric
  range specified (with no case folding). When in doubt, flex will assume
  that you meant the literal numeric range, and will issue a warning. The
  exception to this rule is a character range such as `[a-z]' or `[S-W]'
  where it is obvious that you want case-folding to occur.

What I suspect is happening is that lionfish is running the buildfarm
script in a non-C locale, in which flex finds that some high-bit-set
characters are case-folded by tolower() and accordingly issues this
complaint.  Now, the manual says that flex "will assume that you meant
the literal numeric range", and that the behavior is fully determined
when flex runs (ie, there are no run-time invocations of tolower(), and
indeed none are to be seen in pl_scan.c).  Together those seem to mean
that we'll get the behavior we want anyway.  But the warning is a bit
nervous-making.
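
As a rough illustration of the locale dependence suspected above, the
sketch below lower-cases a single byte with tr under a chosen locale and
prints the result in octal.  It only shows the C library behavior the
hypothesis relies on; it says nothing about what flex does internally.
The helper name lower_byte is made up for this example.

```shell
# Hypothetical helper: lower-case the byte whose octal value is $1
# under the locale named in $2, and print the resulting byte in octal.
lower_byte() {
    printf "\\$1" \
        | env LC_ALL="$2" tr '[:upper:]' '[:lower:]' \
        | od -An -to1 \
        | tr -d ' \n'
}

# In the C locale only A-Z count as upper case, so a high-bit byte such
# as \311 passes through unchanged while 'A' (\101) becomes 'a' (\141).
lower_byte 311 C; echo
lower_byte 101 C; echo
```

Running the same helper with a single-byte non-C locale (for instance a
Latin-1 locale, if one happens to be installed) may fold \311 to a
different byte, which is the kind of difference the warning seems to be
about.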

I wonder if it'd be a good idea to invoke flex with a command like
        LANG=C flex ...
to try to improve the odds that it sees C locale when it's figuring
out what "case insensitive" means.
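
A minimal sketch of that idea follows, assuming a POSIX shell.  It sets
LC_ALL as well as LANG, because LC_ALL takes precedence over both LANG
and the individual LC_* variables, so LANG=C alone would not help if the
buildfarm environment exports LC_ALL or LC_CTYPE.  The wrapper name
run_in_c_locale and the flex command line in the comment are
illustrative, not taken from the actual build scripts.

```shell
# Hypothetical wrapper: run any command with the locale pinned to "C".
# env is used so the settings apply only to the child process.
run_in_c_locale() {
    env LC_ALL=C LANG=C "$@"
}

# Intended use (illustrative command line, not run here):
#   run_in_c_locale flex -opl_scan.c scan.l

# Harmless demonstration of what the child process sees:
run_in_c_locale sh -c 'echo "LC_ALL=$LC_ALL LANG=$LANG"'
```

In a makefile the same effect could be had by prefixing the flex
invocation with the two variable assignments directly.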

Anyone want to look into it more closely?

                        regards, tom lane
