2016-01-20 10:17 GMT+01:00 Tatsuo Ishii <is...@postgresql.org>:

> > Hi
> >
> > 2016-01-20 7:20 GMT+01:00 Tatsuo Ishii <is...@postgresql.org>:
> >
> >> > 2016-01-20 3:47 GMT+01:00 Tatsuo Ishii <is...@postgresql.org>:
> >> >
> >> >> test=# select format('%I', t) from t1;
> >> >>   format
> >> >> ----------
> >> >>  aaa
> >> >>  "AAA"
> >> >>  "あいう"
> >> >> (3 rows)
> >> >>
> >> >> Why does the text value in the third row need to be double quoted?
> >> >> (Note that it consists of multibyte characters.) The same applies to
> >> >> quote_ident().
> >> >>
> >> >> Elsewhere we accept identifiers made of multibyte characters without
> >> >> double quotes (non-delimited identifiers).
> >> >>
> >> >> test=# create table t2(あいう text);
> >> >> CREATE TABLE
> >> >> test=# insert into t2 values('aaa');
> >> >> INSERT 0 1
> >> >> test=# select あいう from t2;
> >> >>  あいう
> >> >> --------
> >> >>  aaa
> >> >> (1 row)
> >> >
> >> > format uses the same routine as quote_ident, so quote_ident should be
> >> > fixed first.
> >>
> >> Yes, I had that in my mind too.
> >>
> >> Attached is the proposed patch to fix the bug.
> >> Regression tests passed.
> >>
> >> Here is an example after the patch. Note that the third row is not
> >> quoted any more.
> >>
> >> test=#  select format('%I', あいう) from t2;
> >>  format
> >> --------
> >>  aaa
> >>  "AAA"
> >>  あああ
> >> (3 rows)
> >>
> >> Best regards,
> >> --
> >> Tatsuo Ishii
> >> SRA OSS, Inc. Japan
> >> English: http://www.sraoss.co.jp/index_en.php
> >> Japanese:http://www.sraoss.co.jp
> >>
> >> diff --git a/src/backend/utils/adt/ruleutils.c b/src/backend/utils/adt/ruleutils.c
> >> index 3783e97..b93fc27 100644
> >> --- a/src/backend/utils/adt/ruleutils.c
> >> +++ b/src/backend/utils/adt/ruleutils.c
> >> @@ -9405,7 +9405,7 @@ quote_identifier(const char *ident)
> >>          * would like to use <ctype.h> macros here, but they might yield unwanted
> >>          * locale-specific results...
> >>          */
> >> -       safe = ((ident[0] >= 'a' && ident[0] <= 'z') || ident[0] == '_');
> >> +       safe = ((ident[0] >= 'a' && ident[0] <= 'z') || ident[0] == '_' || IS_HIGHBIT_SET(ident[0]));
> >>
> >>         for (ptr = ident; *ptr; ptr++)
> >>         {
> >> @@ -9413,7 +9413,8 @@ quote_identifier(const char *ident)
> >>
> >>                 if ((ch >= 'a' && ch <= 'z') ||
> >>                         (ch >= '0' && ch <= '9') ||
> >> -                       (ch == '_'))
> >> +                       (ch == '_') ||
> >> +                       (IS_HIGHBIT_SET(ch)))
> >>                 {
> >>                         /* okay */
> >>                 }
> >>
> >>
> > This patch is simple. I remember being surprised a few months ago that we
> > allow any multibyte char.
> >
> > +1
>
> If we go this way, the question is whether we should back-patch this,
> since the patch apparently changes existing behavior. Comments? I think
> we should not.
>

I am sure that we should not backport this change. It can break customers'
regression tests. The current behavior isn't 100% correct, but it is safe.
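
To make the behavior change concrete, here is a minimal standalone sketch of
the character test as it would look after the patch. It is not the actual
ruleutils.c code: the function name ident_chars_are_safe is hypothetical, the
IS_HIGHBIT_SET macro is redefined locally on the assumption that it tests bit
0x80 of a byte, and the keyword check that quote_identifier also performs is
left out.

```c
#include <stdbool.h>

/* Local stand-in for the PostgreSQL macro (assumption: tests bit 0x80). */
#define IS_HIGHBIT_SET(ch) ((unsigned char) (ch) & 0x80)

/*
 * Hypothetical helper: returns true when every byte of ident passes the
 * patched character test, i.e. the identifier would not need quoting on
 * character grounds alone. The keyword check is deliberately omitted.
 */
static bool
ident_chars_are_safe(const char *ident)
{
	const char *ptr;

	/* First byte: lowercase ASCII letter, underscore, or high-bit byte. */
	if (!((ident[0] >= 'a' && ident[0] <= 'z') ||
		  ident[0] == '_' ||
		  IS_HIGHBIT_SET(ident[0])))
		return false;

	/* Remaining bytes may also be ASCII digits. */
	for (ptr = ident; *ptr; ptr++)
	{
		char		ch = *ptr;

		if ((ch >= 'a' && ch <= 'z') ||
			(ch >= '0' && ch <= '9') ||
			(ch == '_') ||
			IS_HIGHBIT_SET(ch))
			continue;			/* okay */

		return false;			/* uppercase, punctuation, etc. */
	}

	return true;
}
```

With this test, any identifier made only of high-bit bytes stops being quoted,
while uppercase ASCII such as AAA is still quoted. Every stored expected
output that contains a quoted multibyte identifier would therefore change,
which is why I would keep this out of the back branches.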

Pavel


