Re: [HACKERS] Better to dump tabs as tabs, or \t?
On 5/27/06, Tom Lane [EMAIL PROTECTED] wrote:
> Historically pg_dump has taken pains to dump ASCII control characters as backslash constructs, for instance \t for tab. I am thinking this is not such a great idea, and that it'd be more portable rather than less so if we got rid of that logic and just dumped tab as tab, etc. In particular, making this play nice with standard_conforming_strings seems unpleasant: we'll have to emit E'' strings which are certainly not portable, not even to older PG releases.

Could we just give pg_dump a switch that toggles between standard_conforming_strings and old escaped strings?

IMHO this decision is similar to the COPY/INSERT decision: it depends on what the admin plans to do with the dump, what tools are used on it, whether there is a need to reload on an older postgres, etc. All of these are things that the postgres tools cannot deduce.

By default, pg_dump should output standard_conforming_strings, in keeping with the policy of moving to standard SQL quoting. And when the switch is given, pg_dump should put a SET at the start of the dump and not use E'' strings, thus giving an option for backwards compatibility. Such an option would considerably lower the pain of migrating data between versions.

--
marko

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
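For readers following along, here is a minimal Python sketch of the older "backslash construct" style that this exchange is about. The helper name backslash_style and its e_prefix parameter are hypothetical, it covers only a few control characters, and it is not pg_dump's actual code. With standard_conforming_strings off, a plain '...' literal interprets backslash escapes; with it on, the same escapes are only interpreted inside E'...', which is the portability concern raised in the quoted message.

```python
# Hypothetical illustration only; not taken from pg_dump.
# Characters that the backslash style rewrites inside a string literal.
_ESCAPES = {
    "\\": "\\\\",  # a backslash must itself be escaped
    "'": "''",     # single quotes are doubled in either style
    "\t": "\\t",
    "\n": "\\n",
    "\r": "\\r",
}


def backslash_style(value: str, e_prefix: bool) -> str:
    """Render value as a SQL literal using backslash constructs.

    e_prefix=True writes E'...', which is needed for the escapes to be
    interpreted when standard_conforming_strings is on.
    """
    body = "".join(_ESCAPES.get(ch, ch) for ch in value)
    return ("E'" if e_prefix else "'") + body + "'"


if __name__ == "__main__":
    print(backslash_style("a\tb", e_prefix=False))
    print(backslash_style("a\tb", e_prefix=True))
```

The two printed literals differ only in the E prefix, and that prefix is the part an older server would not accept.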
Re: [HACKERS] Better to dump tabs as tabs, or \t?
Marko Kreen [EMAIL PROTECTED] writes:
> On 5/27/06, Tom Lane [EMAIL PROTECTED] wrote:
> > Historically pg_dump has taken pains to dump ASCII control characters as backslash constructs, for instance \t for tab. I am thinking this is not such a great idea, and that it'd be more portable rather than less so if we got rid of that logic and just dumped tab as tab, etc. In particular, making this play nice with standard_conforming_strings seems unpleasant: we'll have to emit E'' strings which are certainly not portable, not even to older PG releases.
>
> Could we just give a switch to pg_dump, which toggles between standard_conforming_strings and old escaped strings?

The plan is that it'll dump according to what it finds as the standard_conforming_strings setting on the source server. If you feel a need to override that setting, you can use PGOPTIONS or the other usual ways to set a GUC variable for a program.

However, my thought on the point at hand is to just go over to dumping control characters literally in either case. This is backwards-compatible to all PG versions and I don't know of a reason to think it wouldn't work (at least as well as the backslash constructs anyway) for portability to other databases.

Note: this only affects strings dumped as part of SQL commands; COPY data isn't at issue, since we're not planning to change the semantics of that. COPY has always dumped tab as \t and I don't intend to change it. But pg_dump --inserts would be affected, also strings appearing in view definitions and such.

We have some precedent for this in that pg_dump has by default dumped function definitions as $$ literals for a release or two now, and no one's complained of whitespace getting munged in function definitions.

regards, tom lane
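The "dump control characters literally" idea can be sketched the same way. This is a rough Python illustration under stated assumptions, not pg_dump's implementation: literal_style and its std_strings flag are made-up names, and std_strings stands for the standard_conforming_strings value found on the source server. Tabs and newlines pass through as raw characters, so no E'' prefix is involved; only quotes, and backslashes when the setting is off, still need doubling.

```python
# Hypothetical illustration only; not taken from pg_dump.
def literal_style(value: str, std_strings: bool) -> str:
    """Render value as a SQL literal, leaving control characters as-is.

    std_strings mirrors standard_conforming_strings on the source server:
    when it is off, a backslash in the data must still be doubled because
    plain '...' literals treat backslash as an escape character.
    """
    body = value.replace("'", "''")  # quote doubling applies in both modes
    if not std_strings:
        body = body.replace("\\", "\\\\")
    return "'" + body + "'"


if __name__ == "__main__":
    # The tab is written out as a real tab character in both modes.
    print(repr(literal_style("a\tb", std_strings=True)))
    print(repr(literal_style("a\tb", std_strings=False)))
```

For data without backslashes the two modes produce identical text, which is consistent with the backwards-compatibility point made above.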
Re: [HACKERS] Better to dump tabs as tabs, or \t?
COPY wants \r and \n to be used because it checks for line endings, but your change is only for the SQL strings, and you are right, it is more portable to dump the actual bytes than backslash escapes.

---

Tom Lane wrote:
> Marko Kreen [EMAIL PROTECTED] writes:
> > On 5/27/06, Tom Lane [EMAIL PROTECTED] wrote:
> > > Historically pg_dump has taken pains to dump ASCII control characters as backslash constructs, for instance \t for tab. I am thinking this is not such a great idea, and that it'd be more portable rather than less so if we got rid of that logic and just dumped tab as tab, etc. In particular, making this play nice with standard_conforming_strings seems unpleasant: we'll have to emit E'' strings which are certainly not portable, not even to older PG releases.
> >
> > Could we just give a switch to pg_dump, which toggles between standard_conforming_strings and old escaped strings?
>
> The plan is that it'll dump according to what it finds as the standard_conforming_strings setting on the source server. If you feel a need to override that setting, you can use PGOPTIONS or the other usual ways to set a GUC variable for a program.
>
> However, my thought on the point at hand is to just go over to dumping control characters literally in either case. This is backwards-compatible to all PG versions and I don't know of a reason to think it wouldn't work (at least as well as the backslash constructs anyway) for portability to other databases.
>
> Note: this only affects strings dumped as part of SQL commands; COPY data isn't at issue, since we're not planning to change the semantics of that. COPY has always dumped tab as \t and I don't intend to change it. But pg_dump --inserts would be affected, also strings appearing in view definitions and such.
>
> We have some precedent for this in that pg_dump has by default dumped function definitions as $$ literals for a release or two now, and no one's complained of whitespace getting munged in function definitions.
>
> regards, tom lane

--
Bruce Momjian   http://candle.pha.pa.us
EnterpriseDB    http://www.enterprisedb.com
+ If your life is a hard drive, Christ can be your backup. +
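The point about COPY needing \r and \n can be illustrated with a small sketch. This is a simplified Python model of COPY's text format, assuming the default tab delimiter and newline row terminator; copy_escape and copy_row are hypothetical names, and NULL handling and the less common control characters are left out. Because the delimiter and the row terminator are themselves a tab and a newline, any such character inside a field has to be escaped or the row could not be parsed back.

```python
# Hypothetical, simplified model of COPY text-format escaping.
_COPY_ESCAPES = {
    "\\": "\\\\",  # backslash is the escape character itself
    "\t": "\\t",   # tab is the default column delimiter
    "\n": "\\n",   # newline terminates a row
    "\r": "\\r",   # carriage return is also treated as a line ending
}


def copy_escape(field: str) -> str:
    """Escape one field so it cannot be mistaken for a delimiter or line end."""
    return "".join(_COPY_ESCAPES.get(ch, ch) for ch in field)


def copy_row(fields: list[str]) -> str:
    """Join escaped fields with tabs and end the row with a single newline."""
    return "\t".join(copy_escape(f) for f in fields) + "\n"


if __name__ == "__main__":
    # Two fields, one containing a tab and one containing a newline.
    print(repr(copy_row(["a\tb", "c\nd"])))
```

SQL string literals have no such framing constraint, which is why the literal-bytes change discussed in this thread can apply to INSERT-style output while leaving COPY data alone.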