On Tue, May 28, 2013 at 8:58 PM, Anthony Ramine <n.ox...@gmail.com> wrote:
> Case folding is not done correctly when matching against the [:upper:]
> character class and uppercased character ranges (e.g. A-Z).
> Specifically, an uppercase letter fails to match against any of them
> when case folding is requested because plain characters in the pattern
> and the whole string are preemptively lowercased to handle the base case
> fast.

I did a little test with glibc's fnmatch and also checked its source
code. With case folding on, I don't think 'a' matches [:upper:] there.
I'm not sure whether that's correct behavior or a bug in glibc. The
spec is not clear (I think) on this. I guess we should just assume
that 'a' should match '[:upper:]'?
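
A minimal sketch of such a check, assuming a libc whose fnmatch()
provides FNM_CASEFOLD (a GNU extension, not POSIX). The helper names
are hypothetical and this is not necessarily the test described above.
It only prints what the local libc does in the case-folding cases,
since that is the open question:

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE /* FNM_CASEFOLD is a GNU extension, not POSIX */
#endif
#include <assert.h>
#include <fnmatch.h>
#include <stdio.h>

/*
 * Hypothetical helper: returns 1 on match, 0 on no match, and -1 on
 * error or when case folding is requested but FNM_CASEFOLD is missing.
 */
static int fn_matches(const char *pattern, const char *text, int casefold)
{
    int flags = 0;
    int ret;

    if (casefold) {
#ifdef FNM_CASEFOLD
        flags |= FNM_CASEFOLD;
#else
        return -1;
#endif
    }
    ret = fnmatch(pattern, text, flags);
    if (ret == 0)
        return 1;
    return ret == FNM_NOMATCH ? 0 : -1;
}

/* Print the local libc's answers for the case-folding cases. */
static void report_upper_casefold(void)
{
    /* Without case folding the answers are not in doubt. */
    assert(fn_matches("[[:upper:]]", "A", 0) == 1);
    assert(fn_matches("[[:upper:]]", "a", 0) == 0);

    printf("[[:upper:]] vs \"a\", casefold: %d\n",
           fn_matches("[[:upper:]]", "a", 1));
    printf("[A-Z] vs \"a\", casefold: %d\n",
           fn_matches("[A-Z]", "a", 1));
}
```

Calling report_upper_casefold() from main() shows whether the libc at
hand treats a lowercase letter as a match for [:upper:] or A-Z when
folding case.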

> @@ -196,6 +196,11 @@ static int dowild(const uchar *p, const uchar *text, unsigned int flags)
>                                         }
>                                         if (t_ch <= p_ch && t_ch >= prev_ch)
>                                                 matched = 1;
> +                                       else if ((flags & WM_CASEFOLD) && ISLOWER(t_ch)) {
> +                                               uchar t_ch_upper = toupper(t_ch);
> +                                               if (t_ch_upper <= p_ch && t_ch_upper >= prev_ch)
> +                                                       matched = 1;
> +                                       }

Or we could stick with tolower. Something like this:

if ((t_ch <= p_ch && t_ch >= prev_ch) ||
   ((flags & WM_CASEFOLD) &&
      t_ch <= tolower(p_ch) && t_ch >= tolower(prev_ch)))
   matched = 1;

I think it's easier to read if we either downcase all, or upcase all, not both.
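
To make the downcase-only variant concrete, here is a self-contained
sketch. It assumes, as the patch description says, that the text
character has already been lowercased when case folding is requested.
range_matches() and its casefold parameter are hypothetical stand-ins
for the inline code in dowild() and the WM_CASEFOLD flag test:

```c
#include <assert.h>
#include <ctype.h>

typedef unsigned char uchar;

/*
 * Hypothetical helper, not part of wildmatch.c: returns 1 if t_ch lies
 * in the range prev_ch..p_ch. With casefold set, t_ch is assumed to be
 * lowercase already, so the range ends are lowercased before comparing.
 */
static int range_matches(uchar t_ch, uchar prev_ch, uchar p_ch, int casefold)
{
    if (t_ch <= p_ch && t_ch >= prev_ch)
        return 1;
    if (casefold &&
        t_ch <= tolower(p_ch) && t_ch >= tolower(prev_ch))
        return 1;
    return 0;
}

/* Small usage check: 'a' stands for an 'A' that was lowercased up front. */
static void check_range_matches(void)
{
    assert(range_matches('a', 'A', 'Z', 1) == 1); /* folded: matches A-Z */
    assert(range_matches('a', 'A', 'Z', 0) == 0); /* not folded: no match */
    assert(range_matches('1', 'A', 'Z', 1) == 0); /* digits stay outside */
}
```

Only tolower() appears here, so the text and the pattern are folded in
the same direction.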

>                                         p_ch = 0; /* This makes "prev_ch" get set to 0. */
>                                 } else if (p_ch == '[' && p[1] == ':') {
>                                         const uchar *s;
> @@ -245,6 +250,8 @@ static int dowild(const uchar *p, const uchar *text, unsigned int flags)
>                                         } else if (CC_EQ(s,i, "upper")) {
>                                                 if (ISUPPER(t_ch))
>                                                         matched = 1;
> +                                               else if ((flags & WM_CASEFOLD) && ISLOWER(t_ch))
> +                                                       matched = 1;
>                                         } else if (CC_EQ(s,i, "xdigit")) {
>                                                 if (ISXDIGIT(t_ch))
>                                                         matched = 1;

If WM_CASEFOLD is set, maybe isalpha(t_ch) is enough then?
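
As a sketch of that idea, assuming the plain <ctype.h> functions in
place of git's ISUPPER/ISLOWER macros (upper_class_matches() and its
casefold parameter are hypothetical, standing in for the "upper"
branch and the WM_CASEFOLD flag test):

```c
#include <assert.h>
#include <ctype.h>

typedef unsigned char uchar;

/*
 * Hypothetical helper: does t_ch match [:upper:]? With case folding,
 * any letter is accepted, so a single isalpha() test replaces the
 * separate upper and lower checks.
 */
static int upper_class_matches(uchar t_ch, int casefold)
{
    if (casefold)
        return isalpha(t_ch) != 0;
    return isupper(t_ch) != 0;
}

/* Small usage check of the folded and unfolded behavior. */
static void check_upper_class_matches(void)
{
    assert(upper_class_matches('a', 1) == 1); /* folded: lowercase is fine */
    assert(upper_class_matches('a', 0) == 0); /* unfolded: must be upper */
    assert(upper_class_matches('1', 1) == 0); /* non-letters never match */
}
```

In the "C" locale isalpha() accepts exactly the characters accepted by
isupper() or islower(), so this matches the behavior of the patch
there. Other locales may differ.
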
--
Duy