Re: [HACKERS] Better to dump tabs as tabs, or \t?

2006-05-27 Thread Marko Kreen

On 5/27/06, Tom Lane [EMAIL PROTECTED] wrote:

> Historically pg_dump has taken pains to dump ASCII control characters
> as backslash constructs, for instance \t for tab.  I am thinking this
> is not such a great idea, and that it'd be more portable rather than
> less so if we got rid of that logic and just dumped tab as tab, etc.
> In particular, making this play nice with standard_conforming_strings
> seems unpleasant: we'll have to emit E'' strings which are certainly
> not portable, not even to older PG releases.


Could we just give a switch to pg_dump, which toggles between
standard_conforming_strings and old escaped strings?

IMHO this decision is similar to the COPY/INSERT decision - it depends on
what the admin plans to do with the dump, what tools are used on it,
whether there is a need to reload on an older postgres, etc. - and all
of these are things that the postgres tools cannot deduce.

By default, pg_dump should output standard_conforming_strings,
that being in sync with policy to move to standard SQL quoting.

And when the switch is given, pg_dump should put SET at the
start of the dump and not use E'' strings, thus giving an option
for backwards compatibility.

Such an option would considerably lower the pain of migrating data
between versions.

--
marko

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [HACKERS] Better to dump tabs as tabs, or \t?

2006-05-27 Thread Tom Lane
Marko Kreen [EMAIL PROTECTED] writes:
> On 5/27/06, Tom Lane [EMAIL PROTECTED] wrote:
>> Historically pg_dump has taken pains to dump ASCII control characters
>> as backslash constructs, for instance \t for tab.  I am thinking this
>> is not such a great idea, and that it'd be more portable rather than
>> less so if we got rid of that logic and just dumped tab as tab, etc.
>> In particular, making this play nice with standard_conforming_strings
>> seems unpleasant: we'll have to emit E'' strings which are certainly
>> not portable, not even to older PG releases.
>
> Could we just give a switch to pg_dump, which toggles between
> standard_conforming_strings and old escaped strings?

The plan is that it'll dump according to what it finds as the
standard_conforming_strings setting on the source server.
If you feel a need to override that setting, you can use PGOPTIONS
or the other usual ways to set a GUC variable for a program.
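
A rough sketch of the PGOPTIONS route, for illustration only. The helper
name pg_dump_invocation and the database name "mydb" are hypothetical.
The sketch only builds the command and environment; whether a given
server release lets a session change standard_conforming_strings is not
stated in this thread and is not assumed here.

```python
import os


def pg_dump_invocation(dbname, std_strings, base_env=None):
    """Build (argv, env) for running pg_dump with a GUC override.

    Hypothetical helper: it does not run pg_dump itself.
    """
    env = dict(os.environ if base_env is None else base_env)
    setting = "on" if std_strings else "off"
    # libpq hands PGOPTIONS to the server as command-line options;
    # "-c name=value" sets a run-time parameter for the session.
    option = f"-c standard_conforming_strings={setting}"
    # Append to any PGOPTIONS already present instead of overwriting it.
    existing = env.get("PGOPTIONS", "").strip()
    env["PGOPTIONS"] = f"{existing} {option}".strip()
    return ["pg_dump", dbname], env
```

The result could then be handed to subprocess.run(argv, env=env), so the
override applies to that one pg_dump run and nothing else.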

However, my thought on the point at hand is to just go over to
dumping control characters literally in either case.  This is
backwards-compatible to all PG versions and I don't know of a
reason to think it wouldn't work (at least as well as the backslash
constructs anyway) for portability to other databases.
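
To make the difference concrete, here is a minimal sketch of the two
quoting styles under discussion. It is not pg_dump's actual code; the
function names are made up, and the handling of quotes and backslashes
is an assumption about what a dumper still has to do once control
characters are emitted literally.

```python
def quote_sql_literal(value, std_strings):
    """Quote value as a plain '...' literal, leaving control characters as-is."""
    out = value.replace("'", "''")  # standard SQL: double embedded quotes
    if not std_strings:
        # With standard_conforming_strings off, backslash is an escape
        # character inside '...', so a literal backslash must be doubled.
        out = out.replace("\\", "\\\\")
    return "'" + out + "'"


def quote_sql_literal_escaped(value):
    """Older style, for contrast: control characters as backslash
    constructs inside an E'...' literal."""
    table = {"\t": "\\t", "\n": "\\n", "\r": "\\r", "\\": "\\\\", "'": "''"}
    return "E'" + "".join(table.get(ch, ch) for ch in value) + "'"
```

For a value containing a tab and no backslashes, quote_sql_literal gives
the same text under either setting, which is the portability argument
made above; only values containing backslashes still depend on the
setting.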

Note: this only affects strings dumped as part of SQL commands;
COPY data isn't at issue, since we're not planning to change the
semantics of that.  COPY has always dumped tab as \t and I don't
intend to change it.  But pg_dump --inserts would be affected,
also strings appearing in view definitions and such.

We have some precedent for this in that pg_dump has by default
dumped function definitions as $$ literals for a release or two
now, and no one's complained of whitespace getting munged in
function definitions.

regards, tom lane



Re: [HACKERS] Better to dump tabs as tabs, or \t?

2006-05-27 Thread Bruce Momjian

COPY wants \r and \n to be used because it checks for line endings, but
your change is only for the SQL strings, and you are right, it is more
portable to dump as actual bytes than as backslashes.
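
For contrast, a small sketch of why COPY data keeps its backslash
constructs: in COPY's text format a raw newline ends the row and a raw
tab ends the field, so those bytes cannot appear unescaped inside a
value. The function names are hypothetical, and the sketch assumes the
default tab delimiter and \N as the NULL marker.

```python
def copy_escape_field(value):
    """Escape one field for COPY text format (default delimiter and NULL marker)."""
    if value is None:
        return "\\N"  # NULL marker
    table = {"\\": "\\\\", "\t": "\\t", "\n": "\\n", "\r": "\\r"}
    return "".join(table.get(ch, ch) for ch in value)


def copy_row(fields):
    """Join escaped fields with tabs and terminate the row with a newline."""
    return "\t".join(copy_escape_field(f) for f in fields) + "\n"
```

After escaping, the only raw newline in a row is the terminator, which is
what lets COPY find line endings reliably.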


-- 
  Bruce Momjian   http://candle.pha.pa.us
  EnterpriseDB    http://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +
