Re: [HACKERS] Unicode mapping scripts cleanup

2016-10-27 Thread Peter Eisentraut
On 9/1/15 12:13 AM, Peter Eisentraut wrote: > Here is a series of patches to clean up the Unicode mapping script > business in src/backend/utils/mb/Unicode/. I never committed the last of these patches, which have the download locations of the files. I have updated this a bit now and propose it

Re: [HACKERS] Unicode collations in FreeBSD 11, DragonFly BSD 4.4 without ICU

2015-12-09 Thread Robert Haas
On Sun, Dec 6, 2015 at 4:45 PM, Thomas Munro wrote: > Since the ICU patch from the BSD ports trees has been discussed on > this list a few times I thought people might be interested in this > news: > >

[HACKERS] Unicode collations in FreeBSD 11, DragonFly BSD 4.4 without ICU

2015-12-06 Thread Thomas Munro
Hi Since the ICU patch from the BSD ports trees has been discussed on this list a few times I thought people might be interested in this news: https://lists.freebsd.org/pipermail/freebsd-current/2015-October/057781.html https://www.dragonflybsd.org/release44/

Re: [HACKERS] Unicode mapping scripts cleanup

2015-10-03 Thread Andres Freund
Hi, On 2015-09-01 00:13:07 -0400, Peter Eisentraut wrote: > Here is a series of patches to clean up the Unicode mapping script > business in src/backend/utils/mb/Unicode/. It overlaps with the > perlcritic work that I recently wrote about, except that these pieces > are not strictly related to

Re: [HACKERS] Unicode mapping scripts cleanup

2015-09-16 Thread Tatsuo Ishii
> What if we discovered that one of our mappings was wrong? Suppose > that there is some encoding where the Unicode mapping for "a" is > inadvertently mapped to the letter "b" in some other character set, > and "b" is mapped to "a". I imagine that anyone using that encoding > would want this

Re: [HACKERS] Unicode mapping scripts cleanup

2015-09-16 Thread Robert Haas
On Tue, Sep 15, 2015 at 9:00 PM, Tatsuo Ishii wrote: >> Then again, I don't have >> any knowledge about how to handle such changes. But the fact that the >> standards bodies are still making changes indicates that such changes >> are to be expected and should be handled.

Re: [HACKERS] Unicode mapping scripts cleanup

2015-09-15 Thread Peter Eisentraut
On 9/1/15 7:27 PM, Tatsuo Ishii wrote: >> On Tue, Sep 1, 2015 at 5:13 AM, Peter Eisentraut wrote: >>> So apparently, the >>> CJK to Unicode mappings are still evolving and should be updated >>> occasionally. Next steps would be to commit some or all of these >>> differences

Re: [HACKERS] Unicode mapping scripts cleanup

2015-09-15 Thread Tatsuo Ishii
>> I think so. We must be very careful updating the maps. Adding new >> mapping data would cause less problem, but replacing existing mappings >> will be definitely a big problem for users. > > Note that I'm not actually proposing to change the mappings, I just want > to get the scripts into
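The distinction drawn in this thread between adding new mappings and replacing existing ones can be sketched as a diff between an old and a regenerated map. This is only an illustration: `diff_mappings` and the toy dictionaries (modelled on the hypothetical "a"/"b" mix-up) are assumptions, not part of the scripts in src/backend/utils/mb/Unicode/.

```python
def diff_mappings(old, new):
    """Classify differences between two {code: character} maps.

    Hypothetical helper for illustration only.
    """
    added = {k: new[k] for k in new.keys() - old.keys()}
    removed = {k: old[k] for k in old.keys() - new.keys()}
    changed = {
        k: (old[k], new[k])
        for k in old.keys() & new.keys()
        if old[k] != new[k]
    }
    return added, removed, changed


# Toy data: the old map has "a" and "b" swapped; the new map fixes
# that and also adds one entry.
old_map = {0x61: "b", 0x62: "a", 0x63: "c"}
new_map = {0x61: "a", 0x62: "b", 0x63: "c", 0x64: "d"}

added, removed, changed = diff_mappings(old_map, new_map)
print("added:  ", added)    # entries that only extend the map
print("removed:", removed)
print("changed:", changed)  # entries that alter results for existing data
```

Entries in `changed` are the ones that would affect data already converted with the old map, which is why they deserve review before anything is committed.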

Re: [HACKERS] Unicode mapping scripts cleanup

2015-09-01 Thread Tatsuo Ishii
> On Tue, Sep 1, 2015 at 5:13 AM, Peter Eisentraut wrote: >> So apparently, the >> CJK to Unicode mappings are still evolving and should be updated >> occasionally. Next steps would be to commit some or all of these >> differences after additional verification, and then update

Re: [HACKERS] Unicode mapping scripts cleanup

2015-09-01 Thread Tatsuo Ishii
> I discovered that some of the source files that one is supposed to > download don't exist anymore or are labeled obsolete. Also, running the > scripts produces slight differences in the output. So apparently, the > CJK to Unicode mappings are still evolving and should be updated >

Re: [HACKERS] Unicode mapping scripts cleanup

2015-09-01 Thread Greg Stark
On Tue, Sep 1, 2015 at 5:13 AM, Peter Eisentraut wrote: > So apparently, the > CJK to Unicode mappings are still evolving and should be updated > occasionally. Next steps would be to commit some or all of these > differences after additional verification, and then update the

[HACKERS] Unicode mapping scripts cleanup

2015-08-31 Thread Peter Eisentraut
Here is a series of patches to clean up the Unicode mapping script business in src/backend/utils/mb/Unicode/. It overlaps with the perlcritic work that I recently wrote about, except that these pieces are not strictly related to Perl, but wrong comments, missing makefile pieces, and such. I

Re: [HACKERS] unicode questions

2009-12-25 Thread - -
On Thu, Dec 24, 2009 at 5:40 PM, Andrew Dunstan and...@dunslane.net wrote: 1) If I set my database and connection encoding to UTF-8, does pg (and future versions of it) guarantee that unicode code points are stored unmodified? or could it be that pg does some unicode normalization/manipulation

[HACKERS] unicode questions

2009-12-24 Thread - -
Dear PG hackers, I have two questions regarding Unicode support in PG: 1) If I set my database and connection encoding to UTF-8, does pg (and future versions of it) guarantee that unicode code points are stored unmodified? or could it be that pg does some unicode normalization/manipulation with

Re: [HACKERS] unicode questions

2009-12-24 Thread Andrew Dunstan
- - wrote: Dear PG hackers, I have two questions regarding Unicode support in PG: 1) If I set my database and connection encoding to UTF-8, does pg (and future versions of it) guarantee that unicode code points are stored unmodified? or could it be that pg does some unicode

Re: [HACKERS] unicode questions

2009-12-24 Thread Andrew Dunstan
2) How far is normalization support in PG? When I checked a long time ago, there was no such support. Now that the SQL standard mandates a NORMALIZE function that may have changed. Any updates? Creating such a function shouldn't be terribly hard AIUI, if someone wants to submit a

Re: [HACKERS] unicode questions

2009-12-24 Thread David E. Wheeler
On Dec 24, 2009, at 4:14 PM, Andrew Dunstan wrote: 2) How far is normalization support in PG? When I checked a long time ago, there was no such support. Now that the SQL standard mandates a NORMALIZE function that may have changed. Any updates? Creating such a function shouldn't be
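For readers unfamiliar with what "stored unmodified" versus "normalized" means here, the difference can be shown with Python's standard `unicodedata` module. This is a general Unicode illustration under that assumption; it says nothing about what PostgreSQL itself does with stored text.

```python
import unicodedata

composed = "\u00e9"     # "é" as one code point (LATIN SMALL LETTER E WITH ACUTE)
decomposed = "e\u0301"  # "e" followed by COMBINING ACUTE ACCENT

# The two strings render alike but are different code point sequences,
# so they also differ byte-for-byte once encoded as UTF-8.
same_code_points = composed == decomposed
composed_bytes = composed.encode("utf-8")
decomposed_bytes = decomposed.encode("utf-8")

# Normalization converts between the two forms.
nfc = unicodedata.normalize("NFC", decomposed)  # compose
nfd = unicodedata.normalize("NFD", composed)    # decompose

print(same_code_points, composed_bytes, decomposed_bytes)
print(nfc == composed, nfd == decomposed)
```

A store that keeps code points unmodified would return `decomposed` exactly as given; one that normalizes to NFC would return `composed` instead.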

Re: [HACKERS] Unicode UTF-8 table formatting for psql text output

2009-11-23 Thread Peter Eisentraut
On sön, 2009-11-22 at 00:23 -0500, Tom Lane wrote: Roger Leigh rle...@codelibre.net writes: Attached is an updated patch with a couple of tweaks to ensure output is formatted and spaced correctly when border=0, which was off in the last patch. Applied with minor editorialization.

Re: [HACKERS] Unicode UTF-8 table formatting for psql text output

2009-11-23 Thread Tom Lane
Peter Eisentraut pete...@gmx.net writes: What is the plan behind keeping the old format? Are we going to remove it after one release if no one complains, or are we seriously expecting that someone has code that actually parses this? Plan? Do we need a plan? The extra support consists of

Re: [HACKERS] Unicode UTF-8 table formatting for psql text output

2009-11-21 Thread Tom Lane
Roger Leigh rle...@codelibre.net writes: Attached is an updated patch with a couple of tweaks to ensure output is formatted and spaced correctly when border=0, which was off in the last patch. Applied with minor editorialization. Notably, I renamed the backwards-compatible option from

Re: [HACKERS] Unicode UTF-8 table formatting for psql text output

2009-11-15 Thread Roger Leigh
On Sun, Nov 15, 2009 at 12:50:14AM +, Roger Leigh wrote: On Sat, Nov 14, 2009 at 01:31:29PM -0500, Tom Lane wrote: Roger Leigh rle...@codelibre.net writes: The side effect from this change is that some of the testsuite expected data will need updating due to the extra pad spaces

Re: [HACKERS] Unicode UTF-8 table formatting for psql text output

2009-11-14 Thread Roger Leigh
On Mon, Nov 09, 2009 at 05:40:54PM -0500, Bruce Momjian wrote: Tom Lane wrote: Greg Stark gsst...@mit.edu writes: While I agree this looks nicer I wonder what it does to things like excel/gnumeric/ooffice auto-recognizing table layouts and importing files. I'm not sure our old format

Re: [HACKERS] Unicode UTF-8 table formatting for psql text output

2009-11-14 Thread Roger Leigh
On Sat, Nov 14, 2009 at 05:40:24PM +, Roger Leigh wrote: On Mon, Nov 09, 2009 at 05:40:54PM -0500, Bruce Momjian wrote: Tom Lane wrote: Greg Stark gsst...@mit.edu writes: While I agree this looks nicer I wonder what it does to things like excel/gnumeric/ooffice auto-recognizing

Re: [HACKERS] Unicode UTF-8 table formatting for psql text output

2009-11-14 Thread Tom Lane
Roger Leigh rle...@codelibre.net writes: The side effect from this change is that some of the testsuite expected data will need updating due to the extra pad spaces No, we are *not* doing that. Somebody made a change to the print.c logic last year that started adding harmless white space to

Re: [HACKERS] Unicode UTF-8 table formatting for psql text output

2009-11-14 Thread Roger Leigh
On Sat, Nov 14, 2009 at 01:31:29PM -0500, Tom Lane wrote: Roger Leigh rle...@codelibre.net writes: The side effect from this change is that some of the testsuite expected data will need updating due to the extra pad spaces No, we are *not* doing that. Somebody made a change to the print.c

Re: [HACKERS] Unicode UTF-8 table formatting for psql text output

2009-11-09 Thread Bruce Momjian
Tom Lane wrote: Greg Stark gsst...@mit.edu writes: While I agree this looks nicer I wonder what it does to things like excel/gnumeric/ooffice auto-recognizing table layouts and importing files. I'm not sure our old format was so great for this so maybe this is actually an improvement I'm

Re: [HACKERS] Unicode UTF-8 table formatting for psql text output

2009-11-09 Thread Alvaro Herrera
Andrew Dunstan wrote: You can set the field separator to ',' but you can't do a \pset format csv and get CSV with correct quoting, escaping etc AFAICS. It'll still break on line wrapping if wrapping is enabled, and with newlines in the data. If that would be a useful addition, I can
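The gap between "set the field separator to ','" and real CSV with quoting and escaping can be illustrated with Python's standard `csv` module. This is a generic sketch of CSV quoting rules; it does not use or describe psql.

```python
import csv
import io

# One row whose fields contain a comma, double quotes, and a newline.
row = ["plain", "has,comma", 'has "quotes"', "line one\nline two"]

# Naive approach: join on the separator. The embedded comma and newline
# make the result ambiguous to any reader.
naive = ",".join(row)

# CSV writer: fields are quoted where needed and quotes are doubled.
buffer = io.StringIO(newline="")
csv.writer(buffer).writerow(row)
encoded = buffer.getvalue()

# Reading the encoded text back recovers the original four fields.
parsed = next(csv.reader(io.StringIO(encoded, newline="")))

print(repr(naive))
print(repr(encoded))
print(parsed == row)
```

The naive output cannot be split back into four fields reliably, while the quoted form survives a round trip even with newlines in the data.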

Re: [HACKERS] Unicode UTF-8 table formatting for psql text output

2009-10-31 Thread Roger Leigh
On Mon, Oct 26, 2009 at 01:33:19PM -0400, Tom Lane wrote: Greg Stark gsst...@mit.edu writes: While I agree this looks nicer I wonder what it does to things like excel/gnumeric/ooffice auto-recognizing table layouts and importing files. I'm not sure our old format was so great for this so

Re: [HACKERS] Unicode UTF-8 table formatting for psql text output

2009-10-31 Thread Greg Stark
On Mon, Oct 26, 2009 at 11:43 PM, Peter Eisentraut pete...@gmx.net wrote: On Mon, 2009-10-26 at 10:12 -0700, Greg Stark wrote: While I agree this looks nicer I wonder what it does to things like excel/gnumeric/ooffice auto-recognizing table layouts and importing files. I'm not sure our old

Re: [HACKERS] Unicode UTF-8 table formatting for psql text output

2009-10-31 Thread Roger Leigh
On Sat, Oct 31, 2009 at 05:11:10AM -0700, Greg Stark wrote: On Mon, Oct 26, 2009 at 11:43 PM, Peter Eisentraut pete...@gmx.net wrote: On Mon, 2009-10-26 at 10:12 -0700, Greg Stark wrote: While I agree this looks nicer I wonder what it does to things like excel/gnumeric/ooffice

Re: [HACKERS] Unicode UTF-8 table formatting for psql text output

2009-10-31 Thread Andrew Dunstan
Roger Leigh wrote: Wouldn't it be much simpler all around to add a csv output format in addition to the above for this purpose? Spreadsheets can read it in with no trouble at all. We've had CSV output since version 8.0. cheers andrew -- Sent via pgsql-hackers mailing list

Re: [HACKERS] Unicode UTF-8 table formatting for psql text output

2009-10-31 Thread Roger Leigh
On Sat, Oct 31, 2009 at 12:25:22PM -0400, Andrew Dunstan wrote: Roger Leigh wrote: Wouldn't it be much simpler all around to add a csv output format in addition to the above for this purpose? Spreadsheets can read it in with no trouble at all. We've had CSV output since version 8.0.

Re: [HACKERS] Unicode UTF-8 table formatting for psql text output

2009-10-31 Thread Andrew Dunstan
Roger Leigh wrote: On Sat, Oct 31, 2009 at 12:25:22PM -0400, Andrew Dunstan wrote: Roger Leigh wrote: Wouldn't it be much simpler all around to add a csv output format in addition to the above for this purpose? Spreadsheets can read it in with no trouble at all. We've had

Re: [HACKERS] Unicode UTF-8 table formatting for psql text output

2009-10-27 Thread Peter Eisentraut
On Mon, 2009-10-26 at 10:12 -0700, Greg Stark wrote: While I agree this looks nicer I wonder what it does to things like excel/gnumeric/ooffice auto-recognizing table layouts and importing files. I'm not sure our old format was so great for this so maybe this is actually an improvement I'm

Re: [HACKERS] Unicode UTF-8 table formatting for psql text output

2009-10-27 Thread Roger Leigh
On Mon, Oct 26, 2009 at 11:33:40PM +, Roger Leigh wrote: On Mon, Oct 26, 2009 at 07:19:24PM -0400, Tom Lane wrote: Roger Leigh rle...@codelibre.net writes: On Mon, Oct 26, 2009 at 01:33:19PM -0400, Tom Lane wrote: Yeah. We can do what we like with the UTF8 format but I'm considerably

Re: [HACKERS] Unicode UTF-8 table formatting for psql text output

2009-10-26 Thread Peter Eisentraut
On sön, 2009-10-25 at 23:48 +, Roger Leigh wrote: Just for reference, this is what the output looks like (abridged) using the attached patch. Should display fine if your mail client handles UTF-8 messages correctly: rleigh=# \l List of databases

Re: [HACKERS] Unicode UTF-8 table formatting for psql text output

2009-10-26 Thread Greg Stark
2009/10/25 Roger Leigh rle...@codelibre.net: rleigh=# \l List of databases Name │ Owner │ Encoding │ Collation │ Ctype │ Access privileges
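Since the archive flattens the quoted table, here is a small sketch of the two border styles under discussion: plain ASCII versus Unicode box-drawing characters. `render_table` and its layout are hypothetical and are not psql's print.c logic; note also that `len()` counts code points, not display columns, so wide characters would need extra care.

```python
def render_table(headers, rows, unicode_borders):
    """Render a simple aligned table (hypothetical helper, not psql code)."""
    if unicode_borders:
        vert, horiz, cross = "│", "─", "┼"
    else:
        vert, horiz, cross = "|", "-", "+"

    # Column width = longest cell in that column, header included.
    widths = [max(len(str(c)) for c in col) for col in zip(headers, *rows)]

    def fmt(cells):
        padded = (str(c).ljust(w) for c, w in zip(cells, widths))
        return f" {vert} ".join(padded)

    # Rule under the header: one crossing where each separator sits.
    rule = f"{horiz}{cross}{horiz}".join(horiz * w for w in widths)
    return "\n".join([fmt(headers), rule] + [fmt(r) for r in rows])


headers = ["Name", "Owner"]
rows = [["postgres", "rleigh"]]

ascii_table = render_table(headers, rows, unicode_borders=False)
unicode_table = render_table(headers, rows, unicode_borders=True)
print(ascii_table)
print(unicode_table)
```

Only the three border characters differ between the two outputs, which is why a single style setting is enough to switch between them.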

Re: [HACKERS] Unicode UTF-8 table formatting for psql text output

2009-10-26 Thread Tom Lane
Greg Stark gsst...@mit.edu writes: While I agree this looks nicer I wonder what it does to things like excel/gnumeric/ooffice auto-recognizing table layouts and importing files. I'm not sure our old format was so great for this so maybe this is actually an improvement I'm asking for. Yeah.

Re: [HACKERS] Unicode UTF-8 table formatting for psql text output

2009-10-26 Thread Roger Leigh
On Mon, Oct 26, 2009 at 01:33:19PM -0400, Tom Lane wrote: Greg Stark gsst...@mit.edu writes: While I agree this looks nicer I wonder what it does to things like excel/gnumeric/ooffice auto-recognizing table layouts and importing files. I'm not sure our old format was so great for this so

Re: [HACKERS] Unicode UTF-8 table formatting for psql text output

2009-10-26 Thread Tom Lane
Roger Leigh rle...@codelibre.net writes: On Mon, Oct 26, 2009 at 01:33:19PM -0400, Tom Lane wrote: Yeah. We can do what we like with the UTF8 format but I'm considerably more worried about the aspect of making random changes to the plain-ASCII output. I checked (using strace) gnumeric (via

Re: [HACKERS] Unicode UTF-8 table formatting for psql text output

2009-10-26 Thread Roger Leigh
On Mon, Oct 26, 2009 at 07:19:24PM -0400, Tom Lane wrote: Roger Leigh rle...@codelibre.net writes: On Mon, Oct 26, 2009 at 01:33:19PM -0400, Tom Lane wrote: Yeah. We can do what we like with the UTF8 format but I'm considerably more worried about the aspect of making random changes to the

Re: [HACKERS] Unicode UTF-8 table formatting for psql text output

2009-10-25 Thread Roger Leigh
On Sat, Oct 24, 2009 at 06:23:24PM +0100, Roger Leigh wrote: On Fri, Oct 16, 2009 at 01:38:15PM +0300, Peter Eisentraut wrote: I like the new Unicode tables, but the marking of continuation lines looks pretty horrible: List of databases Name │

Re: [HACKERS] Unicode UTF-8 table formatting for psql text output

2009-10-25 Thread Roger Leigh
On Sun, Oct 25, 2009 at 11:48:27PM +, Roger Leigh wrote: There's just one tiny display glitch I can see, and that's if you have mixed wrapping and newlines, you miss the lefthand wrap mark if the line is the last wrapped line and it ends in a newline. It might not be possible to pick that

Re: [HACKERS] Unicode UTF-8 table formatting for psql text output

2009-10-24 Thread Roger Leigh
On Fri, Oct 16, 2009 at 01:38:15PM +0300, Peter Eisentraut wrote: I like the new Unicode tables, but the marking of continuation lines looks pretty horrible: List of databases Name │ Owner │ Encoding │ Collation │ Ctype │ Access privileges

Re: [HACKERS] Unicode UTF-8 table formatting for psql text output

2009-10-16 Thread Peter Eisentraut
(HTML mail to preserve formatting; let's see if it works.) I like the new Unicode tables, but the marking of continuation lines looks pretty horrible: List of databases Name │ Owner │ Encoding │ Collation │ Ctype │ Access privileges

Re: [HACKERS] Unicode UTF-8 table formatting for psql text output

2009-10-16 Thread Roger Leigh
On Fri, Oct 16, 2009 at 01:38:15PM +0300, Peter Eisentraut wrote: (HTML mail to preserve formatting; let's see if it works.) I like the new Unicode tables, but the marking of continuation lines looks pretty horrible: Yes, I'm not so keen myself. The ASCII characters used are '|', ':' and '

Re: [HACKERS] Unicode UTF-8 table formatting for psql text output

2009-10-14 Thread Roger Leigh
On Tue, Oct 13, 2009 at 05:08:20PM -0400, Tom Lane wrote: Roger Leigh rle...@codelibre.net writes: The attached updated patch renames all user-visible uses of utf8 to unicode. It also updates the documentation regarding locale to psql client character set encoding so the docs now match

Re: [HACKERS] Unicode UTF-8 table formatting for psql text output

2009-10-13 Thread Tom Lane
Roger Leigh rle...@codelibre.net writes: The attached updated patch renames all user-visible uses of utf8 to unicode. It also updates the documentation regarding locale to psql client character set encoding so the docs now match the code exactly. Applied with light editorialization. The

Re: [HACKERS] Unicode UTF-8 table formatting for psql text output

2009-10-11 Thread Roger Leigh
On Fri, Oct 09, 2009 at 04:35:46PM -0500, Kevin Grittner wrote: Peter Eisentraut pete...@gmx.net wrote: I think the setting ought be called linestyle unicode (instead of utf8), since the same setting would presumably work in case we ever implement UTF-16 support on the client side.

Re: [HACKERS] Unicode UTF-8 table formatting for psql text output

2009-10-09 Thread Peter Eisentraut
On Tue, 2009-10-06 at 19:35 +0100, Roger Leigh wrote: This patch included a bit of code not intended for inclusion (setting of client encoding based on locale), which the attached (and hopefully final!) revision of the patch excludes. Well, the documentation still claims that this is dependent

Re: [HACKERS] Unicode UTF-8 table formatting for psql text output

2009-10-09 Thread Kevin Grittner
Peter Eisentraut pete...@gmx.net wrote: I think the setting ought be called linestyle unicode (instead of utf8), since the same setting would presumably work in case we ever implement UTF-16 support on the client side. Yeah, anytime one gets sloppy with the distinction between a character

Re: [HACKERS] Unicode UTF-8 table formatting for psql text output

2009-10-07 Thread Brad T. Sliger
On Tuesday 06 October 2009 11:35:03 Roger Leigh wrote: On Tue, Oct 06, 2009 at 10:44:27AM +0100, Roger Leigh wrote: On Mon, Oct 05, 2009 at 04:32:08PM -0400, Tom Lane wrote: Roger Leigh rle...@codelibre.net writes: On Sun, Oct 04, 2009 at 11:22:27PM +0300, Peter Eisentraut wrote:

Re: [HACKERS] Unicode UTF-8 table formatting for psql text output

2009-10-06 Thread Roger Leigh
On Mon, Oct 05, 2009 at 04:32:08PM -0400, Tom Lane wrote: Roger Leigh rle...@codelibre.net writes: On Sun, Oct 04, 2009 at 11:22:27PM +0300, Peter Eisentraut wrote: Elsewhere in the psql code, notably in mbprint.c, we make the decision on whether to apply certain Unicode-aware processing

Re: [HACKERS] Unicode UTF-8 table formatting for psql text output

2009-10-06 Thread Roger Leigh
On Tue, Oct 06, 2009 at 10:44:27AM +0100, Roger Leigh wrote: On Mon, Oct 05, 2009 at 04:32:08PM -0400, Tom Lane wrote: Roger Leigh rle...@codelibre.net writes: On Sun, Oct 04, 2009 at 11:22:27PM +0300, Peter Eisentraut wrote: Elsewhere in the psql code, notably in mbprint.c, we make the

Re: [HACKERS] Unicode UTF-8 table formatting for psql text output

2009-10-05 Thread Roger Leigh
On Sun, Oct 04, 2009 at 11:22:27PM +0300, Peter Eisentraut wrote: I have a comment on this bit: @@ -125,6 +128,17 @@ main(int argc, char *argv[]) /* We rely on unmentioned fields of pset.popt to start out 0/false/NULL */ pset.popt.topt.format = PRINT_ALIGNED; + +

Re: [HACKERS] Unicode UTF-8 table formatting for psql text output

2009-10-05 Thread Tom Lane
Roger Leigh rle...@codelibre.net writes: On Sun, Oct 04, 2009 at 11:22:27PM +0300, Peter Eisentraut wrote: Elsewhere in the psql code, notably in mbprint.c, we make the decision on whether to apply certain Unicode-aware processing based on whether the client encoding is UTF8. The same should

Re: [HACKERS] Unicode UTF-8 table formatting for psql text output

2009-10-04 Thread Roger Leigh
On Fri, Oct 02, 2009 at 05:34:16PM -0700, Brad T. Sliger wrote: On Friday 02 October 2009 04:21:35 Roger Leigh wrote: I have attached a patch which implements the feature as a pset variable. This also slightly simplifies some of the patch since the table style is passed to functions

Re: [HACKERS] Unicode UTF-8 table formatting for psql text output

2009-10-04 Thread Peter Eisentraut
I have a comment on this bit: @@ -125,6 +128,17 @@ main(int argc, char *argv[]) /* We rely on unmentioned fields of pset.popt to start out 0/false/NULL */ pset.popt.topt.format = PRINT_ALIGNED; + + /* Default table style to plain ASCII */ +

Re: [HACKERS] Unicode UTF-8 table formatting for psql text output

2009-10-02 Thread Roger Leigh
On Wed, Sep 30, 2009 at 06:50:46PM -0400, Tom Lane wrote: Roger Leigh rle...@codelibre.net writes: On Wed, 2009-09-30 at 11:03 -0400, Andrew Dunstan wrote: Thinking about this some more, ISTM a much better way of approaching it would be to provide a flag for psql to turn off the fancy

Re: [HACKERS] Unicode UTF-8 table formatting for psql text output

2009-10-02 Thread Brad T. Sliger
On Friday 02 October 2009 04:21:35 Roger Leigh wrote: On Wed, Sep 30, 2009 at 06:50:46PM -0400, Tom Lane wrote: Roger Leigh rle...@codelibre.net writes: On Wed, 2009-09-30 at 11:03 -0400, Andrew Dunstan wrote: Thinking about this some more, ISTM a much better way of approaching it

Re: [HACKERS] Unicode UTF-8 table formatting for psql text output

2009-09-30 Thread Roger Leigh
On Tue, Sep 29, 2009 at 04:28:57PM -0400, Tom Lane wrote: Peter Eisentraut pete...@gmx.net writes: On Tue, 2009-09-29 at 12:01 -0400, Tom Lane wrote: The bigger question is exactly how we expect this stuff to interact with pg_regress' --no-locale switch. We already do clear all these

Re: [HACKERS] Unicode UTF-8 table formatting for psql text output

2009-09-30 Thread Andrew Dunstan
Roger Leigh wrote: Here we just force the locale to C. This does have the disadvantage that --no-locale is made redundant, and any tests which are dependent upon locale (if any?) will be run in the C locale. That is not a solution. We have not that long ago gone to some lengths to

Re: [HACKERS] Unicode UTF-8 table formatting for psql text output

2009-09-30 Thread Tom Lane
Andrew Dunstan and...@dunslane.net writes: Roger Leigh wrote: Here we just force the locale to C. This does have the disadvantage that --no-locale is made redundant, and any tests which are dependent upon locale (if any?) will be run in the C locale. That is not a solution. Right. I think

Re: [HACKERS] Unicode UTF-8 table formatting for psql text output

2009-09-30 Thread Roger Leigh
On Tue, Sep 29, 2009 at 04:32:49PM -0400, Tom Lane wrote: Roger Leigh rle...@codelibre.net writes: C locale means POSIX behavior and nothing but. Indeed it does. However, making LC_CTYPE be UTF-8 rather than ASCII is both possible and still strictly conforming to the letter of the

Re: [HACKERS] Unicode UTF-8 table formatting for psql text output

2009-09-30 Thread Andrew Dunstan
Andrew Dunstan wrote: Roger Leigh wrote: Here we just force the locale to C. This does have the disadvantage that --no-locale is made redundant, and any tests which are dependent upon locale (if any?) will be run in the C locale. That is not a solution. We have not that long ago

Re: [HACKERS] Unicode UTF-8 table formatting for psql text output

2009-09-30 Thread Tom Lane
Roger Leigh rle...@codelibre.net writes: The language in SUSv2 in fact explicitly states that this is allowed. In fact, I've seen documentation that some UNIX systems such as HPUX already do have a UTF-8 C locale as an option. I don't argue with the concept of a C.UTF8 locale --- in fact I

Re: [HACKERS] Unicode UTF-8 table formatting for psql text output

2009-09-30 Thread Tom Lane
Andrew Dunstan and...@dunslane.net writes: Thinking about this some more, ISTM a much better way of approaching it would be to provide a flag for psql to turn off the fancy formatting, and have pg_regress use that flag. Yeah, that's not a bad idea. There are likely to be other client

Re: [HACKERS] Unicode UTF-8 table formatting for psql text output

2009-09-30 Thread Robert Haas
On Wed, Sep 30, 2009 at 11:11 AM, Tom Lane t...@sss.pgh.pa.us wrote: Andrew Dunstan and...@dunslane.net writes: Thinking about this some more, ISTM a much better way of approaching it would be to provide a flag for psql to turn off the fancy formatting, and have pg_regress use that flag.

Re: [HACKERS] Unicode UTF-8 table formatting for psql text output

2009-09-30 Thread Andrew Dunstan
Robert Haas wrote: On Wed, Sep 30, 2009 at 11:11 AM, Tom Lane t...@sss.pgh.pa.us wrote: Andrew Dunstan and...@dunslane.net writes: Thinking about this some more, ISTM a much better way of approaching it would be to provide a flag for psql to turn off the fancy formatting, and have

Re: [HACKERS] Unicode UTF-8 table formatting for psql text output

2009-09-30 Thread Peter Eisentraut
On Wed, 2009-09-30 at 11:03 -0400, Andrew Dunstan wrote: Thinking about this some more, ISTM a much better way of approaching it would be to provide a flag for psql to turn off the fancy formatting, and have pg_regress use that flag. Well, it might not be a bad idea, but adding a feature

Re: [HACKERS] Unicode UTF-8 table formatting for psql text output

2009-09-30 Thread Peter Eisentraut
On Wed, 2009-09-30 at 12:06 -0400, Robert Haas wrote: On Wed, Sep 30, 2009 at 11:11 AM, Tom Lane t...@sss.pgh.pa.us wrote: Andrew Dunstan and...@dunslane.net writes: Thinking about this some more, ISTM a much better way of approaching it would be to provide a flag for psql to turn off the

Re: [HACKERS] Unicode UTF-8 table formatting for psql text output

2009-09-30 Thread Andrew Dunstan
Peter Eisentraut wrote: On Wed, 2009-09-30 at 11:03 -0400, Andrew Dunstan wrote: Thinking about this some more, ISTM a much better way of approaching it would be to provide a flag for psql to turn off the fancy formatting, and have pg_regress use that flag. Well, it might not be a

Re: [HACKERS] Unicode UTF-8 table formatting for psql text output

2009-09-30 Thread Alvaro Herrera
Andrew Dunstan escribió: Peter Eisentraut wrote: On Wed, 2009-09-30 at 11:03 -0400, Andrew Dunstan wrote: Thinking about this some more, ISTM a much better way of approaching it would be to provide a flag for psql to turn off the fancy formatting, and have pg_regress use that flag.

Re: [HACKERS] Unicode UTF-8 table formatting for psql text output

2009-09-30 Thread Robert Haas
On Wed, Sep 30, 2009 at 1:27 PM, Peter Eisentraut pete...@gmx.net wrote: On Wed, 2009-09-30 at 12:06 -0400, Robert Haas wrote: On Wed, Sep 30, 2009 at 11:11 AM, Tom Lane t...@sss.pgh.pa.us wrote: Andrew Dunstan and...@dunslane.net writes: Thinking about this some more, ISTM a much better way

Re: [HACKERS] Unicode UTF-8 table formatting for psql text output

2009-09-30 Thread Roger Leigh
On Wed, Sep 30, 2009 at 01:30:17PM -0400, Andrew Dunstan wrote: Peter Eisentraut wrote: On Wed, 2009-09-30 at 11:03 -0400, Andrew Dunstan wrote: Thinking about this some more, ISTM a much better way of approaching it would be to provide a flag for psql to turn off the fancy

Re: [HACKERS] Unicode UTF-8 table formatting for psql text output

2009-09-30 Thread Peter Eisentraut
On Wed, 2009-09-30 at 14:02 -0400, Alvaro Herrera wrote: All scripts I've seen parsing psql output use unaligned, undecorated mode. I have yet to see one messing with the |'s. Plus, we would have broken that with the : continuation lines. -- Sent via pgsql-hackers mailing list

Re: [HACKERS] Unicode UTF-8 table formatting for psql text output

2009-09-30 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes: The way it works right now (and has worked for the last 5 years or more) is reliable, familiar, and, at least IME, bullet-proof. I don't even see a good case for changing the default, let alone not providing a way to retreat to the old behavior. This

Re: [HACKERS] Unicode UTF-8 table formatting for psql text output

2009-09-30 Thread Tom Lane
Peter Eisentraut pete...@gmx.net writes: On Wed, 2009-09-30 at 14:02 -0400, Alvaro Herrera wrote: All scripts I've seen parsing psql output use unaligned, undecorated mode. I have yet to see one messing with the |'s. Plus, we would have broken that with the : continuation lines. Only for

Re: [HACKERS] Unicode UTF-8 table formatting for psql text output

2009-09-30 Thread Tom Lane
Roger Leigh rle...@codelibre.net writes: On Wed, 2009-09-30 at 11:03 -0400, Andrew Dunstan wrote: Thinking about this some more, ISTM a much better way of approaching it would be to provide a flag for psql to turn off the fancy formatting, and have pg_regress use that flag. The attached

Re: [HACKERS] Unicode UTF-8 table formatting for psql text output

2009-09-30 Thread Peter Eisentraut
On Wed, 2009-09-30 at 18:50 -0400, Tom Lane wrote: It would be a good idea to tie this to a psql magic variable (like ON_ERROR_STOP) so that it could conveniently be set in ~/.psqlrc. I'm not actually sure that we need a dedicated command line switch for it, since you could use -v varname

Re: [HACKERS] Unicode UTF-8 table formatting for psql text output

2009-09-29 Thread Peter Eisentraut
On Mon, 2009-09-28 at 20:49 -0700, Brad T. Sliger wrote: During this review I found that `gmake check` will fail when LANG=en_US.UTF-8 in the environment. In this case the patched psql produces UTF8 line art and the tests expect ASCII line art. pg_regress clears LC_ALL by default,
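Why clearing LC_ALL alone does not prevent a UTF-8 locale from taking effect follows from the POSIX lookup order for a locale category: LC_ALL first, then the category variable (here LC_CTYPE), then LANG. A minimal sketch of that order; `effective_ctype` is a hypothetical helper and not pg_regress code.

```python
def effective_ctype(env):
    """Return the locale name POSIX would pick for LC_CTYPE.

    Hypothetical helper: the first non-empty variable wins, in the
    order LC_ALL, LC_CTYPE, LANG; otherwise the default is "C".
    """
    for name in ("LC_ALL", "LC_CTYPE", "LANG"):
        value = env.get(name)
        if value:
            return value
    return "C"


# LC_ALL cleared but LANG still set: the UTF-8 locale remains in effect.
only_lc_all_cleared = effective_ctype({"LC_ALL": "", "LANG": "en_US.UTF-8"})

# LANG cleared as well: the result falls back to the C locale.
lang_also_cleared = effective_ctype({"LC_ALL": "", "LANG": ""})

print(only_lc_all_cleared, lang_also_cleared)
```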

Re: [HACKERS] Unicode UTF-8 table formatting for psql text output

2009-09-29 Thread Tom Lane
Peter Eisentraut pete...@gmx.net writes: On Mon, 2009-09-28 at 20:49 -0700, Brad T. Sliger wrote: pg_regress clears LC_ALL by default, but does not clear LANG by default. Please find attached a patch that causes pg_regress to also clear LANG by default. It probably doesn't matter much, but

Re: [HACKERS] Unicode UTF-8 table formatting for psql text output

2009-09-29 Thread Roger Leigh
On Tue, Sep 29, 2009 at 12:01:30PM -0400, Tom Lane wrote: Peter Eisentraut pete...@gmx.net writes: On Mon, 2009-09-28 at 20:49 -0700, Brad T. Sliger wrote: pg_regress clears LC_ALL by default, but does not clear LANG by default. Please find attached a patch that causes pg_regress to

Re: [HACKERS] Unicode UTF-8 table formatting for psql text output

2009-09-29 Thread Tom Lane
Roger Leigh rle...@codelibre.net writes: In Debian, we do have plans to introduce a C.UTF-8 locale, Egad, isn't that a contradiction in terms? C locale means POSIX behavior and nothing but. regards, tom lane -- Sent via pgsql-hackers mailing list

Re: [HACKERS] Unicode UTF-8 table formatting for psql text output

2009-09-29 Thread Roger Leigh
On Tue, Sep 29, 2009 at 01:41:27PM -0400, Tom Lane wrote: Roger Leigh rle...@codelibre.net writes: In Debian, we do have plans to introduce a C.UTF-8 locale, Egad, isn't that a contradiction in terms? Not entirely! C locale means POSIX behavior and nothing but. Indeed it does. However,

Re: [HACKERS] Unicode UTF-8 table formatting for psql text output

2009-09-29 Thread Peter Eisentraut
On Tue, 2009-09-29 at 12:01 -0400, Tom Lane wrote: Peter Eisentraut pete...@gmx.net writes: On Mon, 2009-09-28 at 20:49 -0700, Brad T. Sliger wrote: pg_regress clears LC_ALL by default, but does not clear LANG by default. Please find attached a patch that causes pg_regress to also clear

Re: [HACKERS] Unicode UTF-8 table formatting for psql text output

2009-09-29 Thread Tom Lane
Peter Eisentraut pete...@gmx.net writes: On Tue, 2009-09-29 at 12:01 -0400, Tom Lane wrote: The bigger question is exactly how we expect this stuff to interact with pg_regress' --no-locale switch. We already do clear all these variables when --no-locale is specified. I am wondering just what

Re: [HACKERS] Unicode UTF-8 table formatting for psql text output

2009-09-29 Thread Tom Lane
Roger Leigh rle...@codelibre.net writes: C locale means POSIX behavior and nothing but. Indeed it does. However, making LC_CTYPE be UTF-8 rather than ASCII is both possible and still strictly conforming to the letter of the standard. There would be some collation and other restrictions

Re: [HACKERS] Unicode UTF-8 table formatting for psql text output

2009-09-29 Thread Robert Haas
On Tue, Sep 29, 2009 at 4:28 PM, Tom Lane t...@sss.pgh.pa.us wrote: Peter Eisentraut pete...@gmx.net writes: On Tue, 2009-09-29 at 12:01 -0400, Tom Lane wrote: The bigger question is exactly how we expect this stuff to interact with pg_regress' --no-locale switch. We already do clear all

Re: [HACKERS] Unicode UTF-8 table formatting for psql text output

2009-09-29 Thread Alvaro Herrera
Robert Haas escribió: On Tue, Sep 29, 2009 at 4:28 PM, Tom Lane t...@sss.pgh.pa.us wrote: Peter Eisentraut pete...@gmx.net writes: On Tue, 2009-09-29 at 12:01 -0400, Tom Lane wrote: The bigger question is exactly how we expect this stuff to interact with pg_regress' --no-locale switch.

Re: [HACKERS] Unicode UTF-8 table formatting for psql text output

2009-09-29 Thread Tom Lane
Alvaro Herrera alvhe...@commandprompt.com writes: Robert Haas escribió: This seems to mean that we can't apply this patch, since failing the regression tests is not an acceptable behavior. Does the patch pass regression tests in normal conditions? If you consider that normal means LANG=C in

Re: [HACKERS] Unicode UTF-8 table formatting for psql text output

2009-09-29 Thread Andrew Dunstan
Alvaro Herrera wrote: Does the patch pass regression tests in normal conditions? If it does, I see no reason to reject it. If it fails in --locale only, and even then only when the given locale is UTF8, which IIRC it's a seldom-used case, we can see about fixing that separately.

Re: [HACKERS] Unicode UTF-8 table formatting for psql text output

2009-09-29 Thread Alvaro Herrera
Tom Lane escribió: Alvaro Herrera alvhe...@commandprompt.com writes: Robert Haas escribió: This seems to mean that we can't apply this patch, since failing the regression tests is not an acceptable behavior. Does the patch pass regression tests in normal conditions? If you consider

Re: [HACKERS] Unicode UTF-8 table formatting for psql text output

2009-09-28 Thread Brad T. Sliger
On Sunday 27 September 2009 19:03:33 Robert Haas wrote: On Sun, Sep 27, 2009 at 9:24 PM, Selena Deckelmann selenama...@gmail.com wrote: Hi! On Wed, Sep 23, 2009 at 2:16 AM, Roger Leigh rle...@codelibre.net wrote: On Fri, Sep 18, 2009 at 11:30:05AM -0700, Selena Deckelmann wrote: Brad

Re: [HACKERS] Unicode UTF-8 table formatting for psql text output

2009-09-27 Thread Selena Deckelmann
Hi! On Wed, Sep 23, 2009 at 2:16 AM, Roger Leigh rle...@codelibre.net wrote: On Fri, Sep 18, 2009 at 11:30:05AM -0700, Selena Deckelmann wrote: Brad says: The patched code compiles without any additional warnings. Lint gripes about a trailing ',' in 'typedef enum printTextRule' in

Re: [HACKERS] Unicode UTF-8 table formatting for psql text output

2009-09-27 Thread Robert Haas
On Sun, Sep 27, 2009 at 9:24 PM, Selena Deckelmann selenama...@gmail.com wrote: Hi! On Wed, Sep 23, 2009 at 2:16 AM, Roger Leigh rle...@codelibre.net wrote: On Fri, Sep 18, 2009 at 11:30:05AM -0700, Selena Deckelmann wrote: Brad says: The patched code compiles without any additional

Re: [HACKERS] Unicode Normalization

2009-09-24 Thread David E. Wheeler
On Sep 24, 2009, at 6:24 AM, p...@thetdh.com wrote: In a context using normalization, wouldn't you typically want to store a normalized-text type that could perhaps (depending on locale) take advantage of simpler, more-efficient comparison functions? That might be nice, but I'd be wary

Re: [HACKERS] Unicode Normalization

2009-09-24 Thread Andrew Dunstan
David E. Wheeler wrote: On Sep 24, 2009, at 6:24 AM, p...@thetdh.com wrote: In a context using normalization, wouldn't you typically want to store a normalized-text type that could perhaps (depending on locale) take advantage of simpler, more-efficient comparison functions? That might be

Re: [HACKERS] Unicode Normalization

2009-09-24 Thread David E. Wheeler
On Sep 24, 2009, at 8:59 AM, Andrew Dunstan wrote: That might be nice, but I'd be wary of a geometric multiplication of text types. We already have TEXT and CITEXT; what if we had your NTEXT (normalized text) but I wanted it to also be case-insensitive? Actually, I don't think it's

Re: [HACKERS] Unicode Normalization

2009-09-24 Thread pg
In a context using normalization, wouldn't you typically want to store a normalized-text type that could perhaps (depending on locale) take advantage of simpler, more-efficient comparison functions? Whether you're doing INSERT/UPDATE, or importing a flat text file, if you canonicalize
