Re: [HACKERS] CSV mode option for pg_dump

2006-06-14 Thread Andrew Dunstan
Tom Lane said:
 Andrew Dunstan [EMAIL PROTECTED] writes:
 This is just nonsense. There is not the slightest reason that CSV data
  cannot be embedded in a text dump nor exist as the data members of a
 custom or tar dump with the corresponding COPY statements modified
 accordingly.

 Well, the really *core* question here is whether we trust the stability
 of the CSV format definition (and code) enough to want to rely on it
 for data dump/restore purposes.  I'm still a few years away from that,
 myself.  AFAICT the raison d'etre of the CSV code is emit whatever it
 takes to satisfy this, that, and the other broken Microsoft
 application. That's fine as an export tool, but as a dump/reload tool,
 nyet.  If you put it in pg_dump you're just handing neophytes another
 foot-gun.



Well, I'm staggered. Really.

The CSV definition and its lack of formality is a complete red herring in
this, as are references to Microsoft.

The real issue in what you say here is your suggestion that we might not be
able to reproduce the input in some cases via a COPY CSV round trip. If that
is so it's a bug of our (probably my) making, and must be fixed. I assert
that it is not so. In fact all the tests I did during development and since
were premised on recovering the input exactly. The only CSV option that
breaks things in that respect is FORCE NOT NULL, and it is designed for data
coming in from a non Postgres source, so it's not relevant here. Even FORCE
QUOTE won't break things because it never quotes a null value, and the only
semantic significance to us of quoting is the null-ness of the value.

If the code is broken then it should be discoverable by test or code
analysis. There is no need to refer to any other application or standard. So
if you or anyone think there is a case that will not reproduce the data
exactly when the same CSV options are used for output and input, I challenge
you or them to provide a single example.

You say you're a few years away from trusting the code. Well, it's not so
huge that it's beyond analysis, and I'll be happy to explain anything that
puzzles you. Perhaps more importantly, it has been in use now for 18 months.
We discovered one problem with embedded line feeds very early in the 8.0
release cycle, and fixed it. After that I have not heard of a single
problem. And I assure you this code is widely used.

It probably isn't used much as a round trip mechanism, probably in part
because we haven't provided it as a pg_dump option. So maybe we have a
chicken/egg scenario here. We do have some round trip regression tests in
the copy test, and those can be beefed up if necessary to increase your
confidence level.

I'm happy to debate details, but general assertions of we can't trust this
code don't seem worth much to me.

cheers

andrew






---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] CSV mode option for pg_dump

2006-06-14 Thread Bruce Momjian

I agree with Andrew.  If there are bugs in CSV, then we should fix them,
not avoid give users this usability improvement.  I know I have analyzed
the code and fixed a few problems with it in the past.

As far as pg_dump, I am thinking the most flexible solution would be to
allow an arbitrary WITH clause to be added to COPY, like

pg_dump --copy-with CSV

This would allow not only CSV control, but specification of non-CSV
options if desired.

---

Andrew Dunstan wrote:
 Tom Lane said:
  Andrew Dunstan [EMAIL PROTECTED] writes:
  This is just nonsense. There is not the slightest reason that CSV data
   cannot be embedded in a text dump nor exist as the data members of a
  custom or tar dump with the corresponding COPY statements modified
  accordingly.
 
  Well, the really *core* question here is whether we trust the stability
  of the CSV format definition (and code) enough to want to rely on it
  for data dump/restore purposes.  I'm still a few years away from that,
  myself.  AFAICT the raison d'etre of the CSV code is emit whatever it
  takes to satisfy this, that, and the other broken Microsoft
  application. That's fine as an export tool, but as a dump/reload tool,
  nyet.  If you put it in pg_dump you're just handing neophytes another
  foot-gun.
 
 
 
 Well, I'm staggered. Really.
 
 The CSV definition and its lack of formality is a complete red herring in
 this, as are references to Microsoft.
 
 The real issue in what you say here is your suggestion that we might not be
 able to reproduce the input in some cases via a COPY CSV round trip. If that
 is so it's a bug of our (probably my) making, and must be fixed. I assert
 that it is not so. In fact all the tests I did during development and since
 were premised on recovering the input exactly. The only CSV option that
 breaks things in that respect is FORCE NOT NULL, and it is designed for data
 coming in from a non Postgres source, so it's not relevant here. Even FORCE
 QUOTE won't break things because it never quotes a null value, and the only
 semantic significance to us of quoting is the null-ness of the value.
 
 If the code is broken then it should be discoverable by test or code
 analysis. There is no need to refer to any other application or standard. So
 if you or anyone think there is a case that will not reproduce the data
 exactly when the same CSV options are used for output and input, I challenge
 you or them to provide a single example.
 
 You say you're a few years away from trusting the code. Well, it's not so
 huge that it's beyond analysis, and I'll be happy to explain anything that
 puzzles you. Perhaps more importantly, it has been in use now for 18 months.
 We discovered one problem with embedded line feeds very early in the 8.0
 release cycle, and fixed it. After that I have not heard of a single
 problem. And I assure you this code is widely used.
 
 It probably isn't used much as a round trip mechanism, probably in part
 because we haven't provided it as a pg_dump option. So maybe we have a
 chicken/egg scenario here. We do have some round trip regression tests in
 the copy test, and those can be beefed up if necessary to increase your
 confidence level.
 
 I'm happy to debate details, but general assertions of we can't trust this
 code don't seem worth much to me.
 
 cheers
 
 andrew
 
 
 
 
 
 
 ---(end of broadcast)---
 TIP 5: don't forget to increase your free space map settings
 

-- 
  Bruce Momjian   http://candle.pha.pa.us
  EnterpriseDBhttp://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] CSV mode option for pg_dump

2006-06-14 Thread Tom Lane
Bruce Momjian pgman@candle.pha.pa.us writes:
 I agree with Andrew.  If there are bugs in CSV, then we should fix them,
 not avoid give users this usability improvement.

The case for it being a usability improvement seems very weak to me;
no one has yet demonstrated an actual use-case where someone would pull
CSV data out of pg_dump output instead of just dumping the table
directly with COPY.  Now the anti case is admittedly hypothetical:
I'm supposing that we will eventually be bitten by portability problems
with CSV-style dumps not being loadable into future versions.  But given
the weak nature of the pro case, I think we should be conservative
and not take that risk.

regards, tom lane

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] CSV mode option for pg_dump

2006-06-14 Thread Bruce Momjian
Tom Lane wrote:
 Bruce Momjian pgman@candle.pha.pa.us writes:
  I agree with Andrew.  If there are bugs in CSV, then we should fix them,
  not avoid give users this usability improvement.
 
 The case for it being a usability improvement seems very weak to me;
 no one has yet demonstrated an actual use-case where someone would pull
 CSV data out of pg_dump output instead of just dumping the table
 directly with COPY.  Now the anti case is admittedly hypothetical:
 I'm supposing that we will eventually be bitten by portability problems
 with CSV-style dumps not being loadable into future versions.  But given
 the weak nature of the pro case, I think we should be conservative
 and not take that risk.

Well, I saw little request for COPY CSV here, but IRC users were
reporting a lot of interest, and feedback from the release that added it
showed it was a major feature, so just because we haven't see use-case
here doesn't mean it doesn't exist.

As was stated before, the use-case for this is by people we don't
normally have contact with.

-- 
  Bruce Momjian   http://candle.pha.pa.us
  EnterpriseDBhttp://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] CSV mode option for pg_dump

2006-06-14 Thread Tom Lane
Bruce Momjian pgman@candle.pha.pa.us writes:
 As was stated before, the use-case for this is by people we don't
 normally have contact with.

Or who don't exist.

Once again: give me a plausible use-case for pulling this data out of a
pg_dump output file.  You'd need data-massaging code anyway just to
extract the data, so why not expect that code to convert to CSV or
whatever other format you might want?  If you can think of use-cases like
this, why do you think the destination format is necessarily CSV and not
something else?  If it is something else, adding a CSV option to pg_dump
makes it *harder* not easier to write that massaging code, because now
it's got to cope with N dump formats not one.

regards, tom lane

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] CSV mode option for pg_dump

2006-06-14 Thread Bruce Momjian
Tom Lane wrote:
 Bruce Momjian pgman@candle.pha.pa.us writes:
  As was stated before, the use-case for this is by people we don't
  normally have contact with.
 
 Or who don't exist.
 
 Once again: give me a plausible use-case for pulling this data out of a
 pg_dump output file.  You'd need data-massaging code anyway just to
 extract the data, so why not expect that code to convert to CSV or
 whatever other format you might want?  If you can think of use-cases like
 this, why do you think the destination format is necessarily CSV and not
 something else?  If it is something else, adding a CSV option to pg_dump
 makes it *harder* not easier to write that massaging code, because now
 it's got to cope with N dump formats not one.

I don't have to think of a use case.  I trusted the people who said we
needed CSV, so I trust them again if they say doing pg_dump with CSV
would be a good idea.

Also, my suggestion of --copy-with would allow CSV and other format
modifications with minimal code and complexity.

-- 
  Bruce Momjian   http://candle.pha.pa.us
  EnterpriseDBhttp://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] CSV mode option for pg_dump

2006-06-14 Thread Martijn van Oosterhout
On Tue, Jun 13, 2006 at 09:14:48AM -0700, Joshua D. Drake wrote:
 I don't get it. If you can use psql then you already have csv support.
 
 psql -c 'COPY pg_class TO STDOUT WITH CSV' postgres  pg_class.csv
 
 If you data looks like this:
 
 foo   barbaz  bing
 
 You are o.k. You have three columns, tab delimited.
 
 However if you data looks like this:
 
 foo   bar baz bing
 
 You have a problem.

I'm not exactly sure what you're getting at here:

postgres=# create temp table x (a text, b text, c text);
CREATE TABLE
postgres=# insert into x values ('blah', 'hello   world', 'hmmm
postgres'# ');
INSERT 0 1
postgres=# \copy x to stdout csv
blah,helloworld,hmmm


 An alternative although I don't know what kind of headaches it would 
 cause is to have a text delimiter as well as a field delimter, e.g;

Postgresql already delimits CSV fields as required, so I'm not sure what
you're asking here...

Have a nice day,
-- 
Martijn van Oosterhout   kleptog@svana.org   http://svana.org/kleptog/
 From each according to his ability. To each according to his ability to 
 litigate.


signature.asc
Description: Digital signature


Re: [HACKERS] CSV mode option for pg_dump

2006-06-14 Thread Martijn van Oosterhout
On Wed, Jun 14, 2006 at 10:55:04AM -0400, Bruce Momjian wrote:
  Once again: give me a plausible use-case for pulling this data out of a
  pg_dump output file.  You'd need data-massaging code anyway just to
  extract the data, so why not expect that code to convert to CSV or
  whatever other format you might want?  If you can think of use-cases like
  this, why do you think the destination format is necessarily CSV and not
  something else?  If it is something else, adding a CSV option to pg_dump
  makes it *harder* not easier to write that massaging code, because now
  it's got to cope with N dump formats not one.
 
 I don't have to think of a use case.  I trusted the people who said we
 needed CSV, so I trust them again if they say doing pg_dump with CSV
 would be a good idea.

I think the point is that it's hard to imagine a use case for CSV
support in pg_dump that isn't already better served by using psql. If
you add it to pg_dump, people have to write special code to extract it
anyway, so why can't they handle the tab delimited themselves.

OTOH, if you use psql you get straightforward CSV wit no garbage at
the beginning or end, just straight, raw, CSV.

$ psql postgres -c '\copy pg_namespace to stdout csv'
pg_toast,10,
pg_temp_1,10,
pg_catalog,10,{kleptog=UC/kleptog,=U/kleptog}
public,10,{kleptog=UC/kleptog,=UC/kleptog}
information_schema,10,{kleptog=UC/kleptog,=U/kleptog}

Have a nice day,
-- 
Martijn van Oosterhout   kleptog@svana.org   http://svana.org/kleptog/
 From each according to his ability. To each according to his ability to 
 litigate.


signature.asc
Description: Digital signature


Re: [HACKERS] CSV mode option for pg_dump

2006-06-14 Thread Greg Stark

Andrew Dunstan [EMAIL PROTECTED] writes:

 The CSV definition and its lack of formality is a complete red herring in
 this, as are references to Microsoft.

I think then you've missed the real issue.

 The real issue in what you say here is your suggestion that we might not be
 able to reproduce the input in some cases via a COPY CSV round trip. 

No that's not the suggestion. The question is: what do you do when the next
version of Excel comes out and can't parse your CSV files any more? If you
change your format to match then you won't be able to parse old pg_dump files
any more. You'll end up with an option for each csv variant and then how does
the user know which option to use to read a given pg_dump archive?

And before you ask, yes, virtually every version of Excel has changed its csv
file format.

I think the problem here is that pg_dump is serving double duty as a
postgres-postgres tool and some people using it as a kind of batch COPY. What
benefit does it buy you over a script that runs COPY for each table if that's
what you want?

-- 
greg


---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [HACKERS] CSV mode option for pg_dump

2006-06-14 Thread Andrew Dunstan



Bruce Momjian wrote:



As was stated before, the use-case for this is by people we don't
normally have contact with.

 



I do think we need a use case for what we do.

The main use case seems to me to be where you are exporting a whole 
database or most of it with a very large number of tables, and it is 
convenient to have all the CSVs created for you rather than have to make 
them manually one at a time. You could get these out of, say, a tar 
format dump very easily.


I said near the beginning of this that a pgfoundry project to create a 
tool for this might be an alternative way to go. If that's the consensus 
then Ok. I just bristle a bit at the suggestion that we might not get 
back what we started with from a CSV dump, because we can, AFAIK.


cheers

andrew



---(end of broadcast)---
TIP 4: Have you searched our list archives?

  http://archives.postgresql.org


Re: [HACKERS] CSV mode option for pg_dump

2006-06-14 Thread Bruce Momjian
Andrew Dunstan wrote:
 
 
 Bruce Momjian wrote:
 
 
 As was stated before, the use-case for this is by people we don't
 normally have contact with.
 
   
 
 
 I do think we need a use case for what we do.
 
 The main use case seems to me to be where you are exporting a whole 
 database or most of it with a very large number of tables, and it is 
 convenient to have all the CSVs created for you rather than have to make 
 them manually one at a time. You could get these out of, say, a tar 
 format dump very easily.
 
 I said near the beginning of this that a pgfoundry project to create a 
 tool for this might be an alternative way to go. If that's the consensus 
 then Ok. I just bristle a bit at the suggestion that we might not get 
 back what we started with from a CSV dump, because we can, AFAIK.

For me, the use case would be, what format do I want a dump in if it is
for long-term storage?  Do I want it in a PostgreSQL-native format, or
in a more universal format that can be loaded into PostgreSQL tomorrow,
and perhaps loaded into some other database, with modification, ten
years from now.

I just had that issue on my home system for file system backups, going
from cpio to ustar (POSIX.1-1988 / IEEE Std1003.2), but it seems that
POSIX.1-2001 would be best if my operating system supported it. 
(Perhaps cpio was better?)

Anyway, I never thought there would be a large demand for COPY CSV, but
obviously there is, so I have concluded that other people's environment
and skills are different enough from my own that I am willing to accept
the idea there is a use case when I don't understand it.  I will let the
people who work in those environments make that decision.

-- 
  Bruce Momjian   http://candle.pha.pa.us
  EnterpriseDBhttp://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] CSV mode option for pg_dump

2006-06-14 Thread Andrew Dunstan



Greg Stark wrote:


And before you ask, yes, virtually every version of Excel has changed its csv
file format.


 



I'd be interested to hear of Excel variants we haven't catered for - our 
CSV parsing is some of the most flexible and complete around, IMNSHO.  
BTW, I suspect we won't see too much movement on this front in future, 
given the moves on the XML front.


cheers

andrew

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] CSV mode option for pg_dump

2006-06-14 Thread Andrew Dunstan


I wrote:



The main use case seems to me to be where you are exporting a whole 
database or most of it with a very large number of tables, and it is 
convenient to have all the CSVs created for you rather than have to 
make them manually one at a time. You could get these out of, say, a 
tar format dump very easily.





I just noticed that the data members all have \. and some blank lines at 
the end, so wash that out.


We now return you to normal -hacking

cheers

andrew


---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

  http://www.postgresql.org/docs/faq


Re: [HACKERS] CSV mode option for pg_dump

2006-06-13 Thread Matthew T. O'Connor

Bill Bartlett wrote:

Can't -- the main production database is over at a CoLo site with access
only available via SSH, and tightly-restricted SSH at that. Generally
one of the developers will SSH over to the server, pull out whatever
data is needed into a text file via psql or pg_dump, scp the file(s)
back here and send them to the user.


ODBC over an SSH tunnnel?

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
  choose an index scan if your joining column's datatypes do not
  match


Re: [HACKERS] CSV mode option for pg_dump

2006-06-13 Thread Andrew Dunstan
Matthew T. OConnor said:
 Bill Bartlett wrote:
 Can't -- the main production database is over at a CoLo site with
 access only available via SSH, and tightly-restricted SSH at that.
 Generally one of the developers will SSH over to the server, pull out
 whatever data is needed into a text file via psql or pg_dump, scp the
 file(s) back here and send them to the user.

 ODBC over an SSH tunnnel?


I wish I could understand why people are so keen to make other people turn
handsprings in order to avoid a feature which, as Bruce points out, is
already on the TODO list, and which, by my 10 minute analysis, would involve
almost trivial code impact and risk. If this involved major impact I might
understand, but it really doesn't.

I know many people work in a Postgres only world. I wish everybody did, and
then we could just forget about things like CSV. They don't, so we can't.

I think I have said this before, but I'll say it again. From time to time
people thank me for things I have done for Postgres. The two things that
stand out BY FAR on the list of these are CSV import/export and dollar
quoting. This is a widely used feature.

cheers

andrew




---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] CSV mode option for pg_dump

2006-06-13 Thread Bruce Momjian
Andrew Dunstan wrote:
 Matthew T. OConnor said:
  Bill Bartlett wrote:
  Can't -- the main production database is over at a CoLo site with
  access only available via SSH, and tightly-restricted SSH at that.
  Generally one of the developers will SSH over to the server, pull out
  whatever data is needed into a text file via psql or pg_dump, scp the
  file(s) back here and send them to the user.
 
  ODBC over an SSH tunnnel?
 
 
 I wish I could understand why people are so keen to make other people turn
 handsprings in order to avoid a feature which, as Bruce points out, is
 already on the TODO list, and which, by my 10 minute analysis, would involve
 almost trivial code impact and risk. If this involved major impact I might
 understand, but it really doesn't.
 
 I know many people work in a Postgres only world. I wish everybody did, and
 then we could just forget about things like CSV. They don't, so we can't.
 
 I think I have said this before, but I'll say it again. From time to time
 people thank me for things I have done for Postgres. The two things that
 stand out BY FAR on the list of these are CSV import/export and dollar
 quoting. This is a widely used feature.

I think the bottom line is that ease of use isn't as high enough on
the project's priority list as you (and others) think it should be.

I personally feel as you do that we should value ease of use more.

-- 
  Bruce Momjian   http://candle.pha.pa.us
  EnterpriseDBhttp://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [HACKERS] CSV mode option for pg_dump

2006-06-13 Thread Tom Lane
Andrew Dunstan [EMAIL PROTECTED] writes:
 I wish I could understand why people are so keen to make other people turn
 handsprings in order to avoid a feature which, as Bruce points out, is
 already on the TODO list, and which, by my 10 minute analysis, would involve
 almost trivial code impact and risk. If this involved major impact I might
 understand, but it really doesn't.

Supporting all of the CSV options in pg_dump would involve major bloat
in its option set, and it already has far too many options.  If it were
just a matter of adding a --csv switch I wouldn't be complaining, but
there are half a dozen more sub-options, and it seems like every time we
turn around someone is finding a reason for another one.  Propagating
all that cruft through pg_dump would be a PITA, and updating it to track
future additions would be too.

Furthermore, the entire rationale for the feature is predicated on the
claim that programs other than pg_restore might find it useful.  But
this conveniently ignores the fact that if there are any such programs
in existence, what this will really do is BREAK them, because they won't
be able to cope with all the variants that pass for CSV.

My opinions would be less negative if I thought that CSV were a
well-defined format that would never change.  I don't believe that it
has either property, however, and so I'm against letting it get into our
dump file format.  I think we'll just live to regret it if we do.

regards, tom lane

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [HACKERS] CSV mode option for pg_dump

2006-06-13 Thread Bruce Momjian

Good point.  The number of CSV options would be hard to support for
pg_dump.  Any thoughts from anyone on how to do that cleanly?  Could we
just support the default behavior?

---

Tom Lane wrote:
 Andrew Dunstan [EMAIL PROTECTED] writes:
  I wish I could understand why people are so keen to make other people turn
  handsprings in order to avoid a feature which, as Bruce points out, is
  already on the TODO list, and which, by my 10 minute analysis, would involve
  almost trivial code impact and risk. If this involved major impact I might
  understand, but it really doesn't.
 
 Supporting all of the CSV options in pg_dump would involve major bloat
 in its option set, and it already has far too many options.  If it were
 just a matter of adding a --csv switch I wouldn't be complaining, but
 there are half a dozen more sub-options, and it seems like every time we
 turn around someone is finding a reason for another one.  Propagating
 all that cruft through pg_dump would be a PITA, and updating it to track
 future additions would be too.
 
 Furthermore, the entire rationale for the feature is predicated on the
 claim that programs other than pg_restore might find it useful.  But
 this conveniently ignores the fact that if there are any such programs
 in existence, what this will really do is BREAK them, because they won't
 be able to cope with all the variants that pass for CSV.
 
 My opinions would be less negative if I thought that CSV were a
 well-defined format that would never change.  I don't believe that it
 has either property, however, and so I'm against letting it get into our
 dump file format.  I think we'll just live to regret it if we do.
 
   regards, tom lane
 
 ---(end of broadcast)---
 TIP 9: In versions below 8.0, the planner will ignore your desire to
choose an index scan if your joining column's datatypes do not
match
 

-- 
  Bruce Momjian   http://candle.pha.pa.us
  EnterpriseDBhttp://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [HACKERS] CSV mode option for pg_dump

2006-06-13 Thread Martijn van Oosterhout
On Tue, Jun 13, 2006 at 10:20:53AM -0400, Bruce Momjian wrote:
 
 Good point.  The number of CSV options would be hard to support for
 pg_dump.  Any thoughts from anyone on how to do that cleanly?  Could we
 just support the default behavior?

What this tells me is that we need a tool somewhere between psql and
pg_dump, say, pgquery. It's sole purpose in life is to generate output
from various queries. Because it's a seperate tool there's no question
of psql or pg_dump being able to parse them.

While you're at it, you could add modules to support many different
output styles, like CSV, XML, Excel format, HTML, etc.

This I beleive would take the load off psql to provide many different
output styles, as well as the load off pg_dump to produce
parsable-by-third-party output.

Thoughts?

Side note: In my experience Excel happily slurps up tab delimited
output, so I'm not sure why all of this is an issue in the first place.

Have a ncie day,
-- 
Martijn van Oosterhout   kleptog@svana.org   http://svana.org/kleptog/
 From each according to his ability. To each according to his ability to 
 litigate.


signature.asc
Description: Digital signature


Re: [HACKERS] CSV mode option for pg_dump

2006-06-13 Thread Volkan YAZICI
On Jun 13 10:20, Bruce Momjian wrote:
 
 Good point.  The number of CSV options would be hard to support for
 pg_dump.  Any thoughts from anyone on how to do that cleanly?  Could we
 just support the default behavior?

IMHO, it might be better if we'd support a syntax like

  pg_dump --csv=opt0,para0:opt2,opt3

This can save us from the pg_dump parameter pollution a little bit.

Furthermore, I think CSV format for the dump files can be maintained
better under an external project. (pgFoundry?) By this way, main
developers will be able to cope with their own core problems while
other users/developers can contribute on the CSV code easily. And if
any user will ever want to get CSV functionality in the pg_dump,
he/she will just issue a --csv parameter (with the above syntax) and
pg_dump will make a suitable dlopen() call for the related (CSV)
module. Anyway, this is just an idea for modularity; but the main
thing I try to underline is to give pg_dump a module functionality for
similar problems.


Regards.

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] CSV mode option for pg_dump

2006-06-13 Thread Andrew Dunstan



Tom Lane wrote:


there are half a dozen more sub-options, and it seems like every time we
turn around someone is finding a reason for another one.  
 



This is a bit unfair. The feature was introduced in 8.0, and slightly 
enhanced in 8.1. There have not been any additional CSV features this 
release cycle unless my memory is worse than I thought, and I at least 
have said previously that I will be resistant to the addition of further 
CSV options.


My thoughts regarding options for pg_dump was actually to provide a much 
smaller set than the full set available with COPY, specifically to 
provide for using a single rather than a double quote char, and optional 
header lines - no alternate escape or delimiter, and no FORCE QUOTE  
(FORCE NOT NULL isn't relevant as it is only useful for non-postgres 
derived data). At least that would be a reasonable starting point, and 
would I believe cater for the vast majority of uses.


cheers

andrew

---(end of broadcast)---
TIP 4: Have you searched our list archives?

  http://archives.postgresql.org


Re: [HACKERS] CSV mode option for pg_dump

2006-06-13 Thread Joshua D. Drake

Bruce Momjian wrote:

Good point.  The number of CSV options would be hard to support for
pg_dump.  Any thoughts from anyone on how to do that cleanly?  Could we
just support the default behavior?


Although I don't see a real need for the feature, I do think that if we 
were to support 1 (well two if you include the already tab delimited) 
csv output it would be a large amount of bloat.


Perhaps we could pick 1 output, say comma delimted with quoted fields?

foo,bar   ,baz

Joshua D. Drake



---

Tom Lane wrote:

Andrew Dunstan [EMAIL PROTECTED] writes:

I wish I could understand why people are so keen to make other people turn
handsprings in order to avoid a feature which, as Bruce points out, is
already on the TODO list, and which, by my 10 minute analysis, would involve
almost trivial code impact and risk. If this involved major impact I might
understand, but it really doesn't.

Supporting all of the CSV options in pg_dump would involve major bloat
in its option set, and it already has far too many options.  If it were
just a matter of adding a --csv switch I wouldn't be complaining, but
there are half a dozen more sub-options, and it seems like every time we
turn around someone is finding a reason for another one.  Propagating
all that cruft through pg_dump would be a PITA, and updating it to track
future additions would be too.

Furthermore, the entire rationale for the feature is predicated on the
claim that programs other than pg_restore might find it useful.  But
this conveniently ignores the fact that if there are any such programs
in existence, what this will really do is BREAK them, because they won't
be able to cope with all the variants that pass for CSV.

My opinions would be less negative if I thought that CSV were a
well-defined format that would never change.  I don't believe that it
has either property, however, and so I'm against letting it get into our
dump file format.  I think we'll just live to regret it if we do.

regards, tom lane

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match






--

=== The PostgreSQL Company: Command Prompt, Inc. ===
  Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240
  Providing the most comprehensive  PostgreSQL solutions since 1997
 http://www.commandprompt.com/



---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

  http://www.postgresql.org/docs/faq


Re: [HACKERS] CSV mode option for pg_dump

2006-06-13 Thread Andrew Dunstan



Volkan YAZICI wrote:


On Jun 13 10:20, Bruce Momjian wrote:
 


Good point.  The number of CSV options would be hard to support for
pg_dump.  Any thoughts from anyone on how to do that cleanly?  Could we
just support the default behavior?
   



IMHO, it might be better if we'd support a syntax like

 pg_dump --csv=opt0,para0:opt2,opt3

This can save us from the pg_dump parameter pollution a little bit.

Furthermore, I think CSV format for the dump files can be maintained
better under an external project. (pgFoundry?) By this way, main
developers will be able to cope with their own core problems while
other users/developers can contribute on the CSV code easily. And if
any user will ever want to get CSV functionality in the pg_dump,
he/she will just issue a --csv parameter (with the above syntax) and
pg_dump will make a suitable dlopen() call for the related (CSV)
module. Anyway, this is just an idea for modularity; but the main
thing I try to underline is to give pg_dump a module functionality for
similar problems.

 



There are some problems with this, though:

. FORCE QUOTE is table specific, and COPY will barf if you name a column 
that isn't on the table. Providing for this option would involve lots 
more code in pg_dump, as we'd have to filter the list according to the 
column names in each table.


. specifying arbitrary chars for quote, escape and delimiter could be 
tricky from the command line, especially if you want to specify a tab 
delimiter or backslash escape.


cheers

andrew

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] CSV mode option for pg_dump

2006-06-13 Thread Andrew Dunstan



Martijn van Oosterhout wrote:



Side note: In my experience Excel happily slurps up tab delimited
output, so I'm not sure why all of this is an issue in the first place.

 



I guess you experience doesn't run to data that has embedded tabs, for 
example.


There really is a reason we did this in the first place, and it wasn't 
for fun.


cheers

andrew

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] CSV mode option for pg_dump

2006-06-13 Thread Rod Taylor
On Mon, 2006-06-12 at 16:28 -0400, Bill Bartlett wrote:
 Can't -- the main production database is over at a CoLo site with access
 only available via SSH, and tightly-restricted SSH at that. Generally
 one of the developers will SSH over to the server, pull out whatever
 data is needed into a text file via psql or pg_dump, scp the file(s)
 back here and send them to the user.

I don't get it. If you can use psql then you already have csv support.

psql -c 'COPY pg_class TO STDOUT WITH CSV' postgres  pg_class.csv

  -Original Message-
  From: Joshua D. Drake [mailto:[EMAIL PROTECTED] 
  Sent: Monday, June 12, 2006 4:15 PM
  To: Bill Bartlett
  Cc: 'Andrew Dunstan'; 'Tom Lane'; 'PG Hackers'
  Subject: Re: [HACKERS] CSV mode option for pg_dump
  
  
  Bill Bartlett wrote:
   Here's me speaking up -- I'd definitely use it!   As a 
  quick way to pull
   data into Excel to do basic reports or analysis, a CSV 
  format would be 
   great.
  
  Why not just use ODBC?
  
  Joshua D. Drake
  -- 
  
   === The PostgreSQL Company: Command Prompt, Inc. ===
 Sales/Support: +1.503.667.4564 || 24x7/Emergency: 
  +1.800.492.2240
 Providing the most comprehensive  PostgreSQL solutions 
  since 1997
http://www.commandprompt.com/
  
  
 
 
 ---(end of broadcast)---
 TIP 1: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to [EMAIL PROTECTED] so that your
message can get through to the mailing list cleanly
 
-- 


---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [HACKERS] CSV mode option for pg_dump

2006-06-13 Thread Steve Atkins


On Jun 13, 2006, at 7:34 AM, Martijn van Oosterhout wrote:


What this tells me is that we need a tool somewhere between psql and
pg_dump, say, pgquery. It's sole purpose in life is to generate output
from various queries. Because it's a seperate tool there's no question
of psql or pg_dump being able to parse them.

While you're at it, you could add modules to support many different
output styles, like CSV, XML, Excel format, HTML, etc.

This I beleive would take the load off psql to provide many different
output styles, as well as the load off pg_dump to produce
parsable-by-third-party output.

Thoughts?


Perl+DBD::Pg+CPAN does almost all of this already. Lots of support
for countless different output formats, and mostly fairly well battle- 
tested.


I suspect that a perl script to do all that would be dominated by  
commandline

option parsing, as all the hard work is in existing modules.

Would that be adequate, or do we really want to reimplement and  
maintain all

the output format complexity in our own code, in C?

Cheers,
  Steve

---(end of broadcast)---
TIP 4: Have you searched our list archives?

  http://archives.postgresql.org


Re: [HACKERS] CSV mode option for pg_dump

2006-06-13 Thread Joshua D. Drake

Rod Taylor wrote:

On Mon, 2006-06-12 at 16:28 -0400, Bill Bartlett wrote:

Can't -- the main production database is over at a CoLo site with access
only available via SSH, and tightly-restricted SSH at that. Generally
one of the developers will SSH over to the server, pull out whatever
data is needed into a text file via psql or pg_dump, scp the file(s)
back here and send them to the user.


I don't get it. If you can use psql then you already have csv support.

psql -c 'COPY pg_class TO STDOUT WITH CSV' postgres  pg_class.csv


If you data looks like this:

foo barbaz  bing

You are o.k. You have three columns, tab delimited.

However if you data looks like this:

foo bar baz bing

You have a problem.

foo is one column
bar and baz are a single column
bing is a single column

How does excel know that bar	baz is a single column? It doesn't because 
you told it to delimit on tabs and thus you have four columns as far as 
Excel is concerned.


An alternative although I don't know what kind of headaches it would 
cause is to have a text delimiter as well as a field delimter, e.g;


foo bar   baz   bing

Sincerely,

Joshua D. Drake







-Original Message-
From: Joshua D. Drake [mailto:[EMAIL PROTECTED] 
Sent: Monday, June 12, 2006 4:15 PM

To: Bill Bartlett
Cc: 'Andrew Dunstan'; 'Tom Lane'; 'PG Hackers'
Subject: Re: [HACKERS] CSV mode option for pg_dump


Bill Bartlett wrote:
Here's me speaking up -- I'd definitely use it!   As a 

quick way to pull
data into Excel to do basic reports or analysis, a CSV 
format would be 

great.

Why not just use ODBC?

Joshua D. Drake
--

 === The PostgreSQL Company: Command Prompt, Inc. ===
   Sales/Support: +1.503.667.4564 || 24x7/Emergency: 
+1.800.492.2240
   Providing the most comprehensive  PostgreSQL solutions 
since 1997

  http://www.commandprompt.com/




---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly




--

=== The PostgreSQL Company: Command Prompt, Inc. ===
  Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240
  Providing the most comprehensive  PostgreSQL solutions since 1997
 http://www.commandprompt.com/



---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] CSV mode option for pg_dump

2006-06-13 Thread PFC


From what I gather, the CSV format dump would only contain data.
	I think pg_dump is the friend of pg_restore. It dumps everything  
including user defined functions, types, schemas etc. CSV does not fit  
with this.


	Besides, people will probably want to dump into CSV the result of any  
query, to load it into excel, not just the full contents of a table.


	So, why not create a separate tool, someone suggested pg_query for that,  
I second it.
	This tool would take a query and format options, and would output a file  
in whatever format chosen by the user (CSV, COPY format, xml, whatever)


	A script language (python) can be used, which will significantly shorten  
development times and allow easy modularity, as it is easier to add a  
module to a python program than a C program.
	I would vote for Python because I love it and it has a very good postgres  
adapter (psycopg2) which knows how to convers every postgres type to a  
native language type (yes, even multidimensional arrays of BOX get  
converted). And it's really fast at retrieving large volumes of data.


	So you have a stable, fast tool for backup and restore (pg_dump) and a  
rapidly evolving, user-friendly and extendable tool for exporting data,  
and everyone is happy.


	Mr Momijan talks about adding modular functionality to pg_dump. Is it  
really necessary ? What is the objective ? Is it to reuse code in pg_dump  
? I guess not ; if a user wants to dump, for instance, all the tables in a  
schema, implementing this logic in python is only a few lines of code  
(select from information_schema...)


	To be realistic, output format modules should be written in script  
languages. Noone sane is eager to do string manipulation in C. Thus these  
modules would have to somehow fit with pg_dump, maybe with a pipe or  
something. This means designing another protocol. Reimplementing in a  
scripting langage the parts of pg_dump which will be reused by this  
project (mainly, enumerating tables and stuff) will be far easier.


Just look.

Python 2.4.2 (#1, Mar 30 2006, 14:34:35)
[GCC 3.4.4 (Gentoo 3.4.4-r1, ssp-3.4.4-1.0, pie-8.7.8)] on linux2
Type help, copyright, credits or license for more information.

...opens a db connection...


c.execute( SELECT * FROM test.csv )
data = c.fetchall()
data
[[1, datetime.date(2006, 6, 13), 'this\tcontains\ttabulations'], [2,  
datetime.date(2006, 6, 13), this'contains'quotes], [3,  
datetime.date(2006, 6, 13), 'thiscontainsdouble quotes']]

import csv, sys
c = csv.writer( sys.stdout, dialect = csv.excel )
c.writerows( data )

1,2006-06-13,this   containstabulations
2,2006-06-13,this'contains'quotes
3,2006-06-13,thiscontainsdouble quotes

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [HACKERS] CSV mode option for pg_dump

2006-06-13 Thread Andrew Dunstan



PFC wrote:



From what I gather, the CSV format dump would only contain data.
I think pg_dump is the friend of pg_restore. It dumps everything  
including user defined functions, types, schemas etc. CSV does not 
fit  with this.


   



This is just nonsense. There is not the slightest reason that CSV data 
cannot be embedded in a text dump nor exist as the data members of a 
custom or tar dump with the corresponding COPY statements modified 
accordingly.


Really, let's get the facts straight, please.

cheers

andrew

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] CSV mode option for pg_dump

2006-06-13 Thread Joshua D. Drake


Would that be adequate, or do we really want to reimplement and maintain 
all

the output format complexity in our own code, in C?


I think the point is that we should provide a native implementation 
because not everyone is crazy enough to use perl (blatant jab ;)). I 
would never expect a customer to write a perl or python script just to 
get their data in what is widely considered a standard business format 
that can be imported by their userland application.


The people on the hackers list, are NOT the target for this feature. The 
people on general, admin and novice are.


Sincerely,

Joshua D. Drake






Cheers,
  Steve

---(end of broadcast)---
TIP 4: Have you searched our list archives?

  http://archives.postgresql.org




--

=== The PostgreSQL Company: Command Prompt, Inc. ===
  Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240
  Providing the most comprehensive  PostgreSQL solutions since 1997
 http://www.commandprompt.com/



---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] CSV mode option for pg_dump

2006-06-13 Thread Bill Bartlett
 From: Rod Taylor [mailto:[EMAIL PROTECTED] 
 Sent: Tuesday, June 13, 2006 11:31 AM
 
 
 On Mon, 2006-06-12 at 16:28 -0400, Bill Bartlett wrote:
  Can't -- the main production database is over at a CoLo site with 
  access only available via SSH, and tightly-restricted SSH at that. 
  Generally one of the developers will SSH over to the 
 server, pull out 
  whatever data is needed into a text file via psql or 
 pg_dump, scp the 
  file(s) back here and send them to the user.
 
 I don't get it. If you can use psql then you already have csv support.
 
 psql -c 'COPY pg_class TO STDOUT WITH CSV' postgres  pg_class.csv

Ah - RTF-UPDATED-M on my part.  Most of my systems are still in PG 7.4.x
databases so I tend to stick with the 7.x docs, and I didn't notice the
WITH CSV option added in 8.0.  That, plus temp tables, will be very
useful.

However, I also agree with the need for a new pg_query / pg_export
program. A program geared solely towards exporting the results of a
query would allow many of the options that are needed for the
ever-growing variety of output formats (XML, CSV, HTML, XHTML, etc.) and
details for each format without needing to clutter up pg_dump with
things that really having nothing to do with backing up and restoring
data.  It could also allow a large range of options related to getting
data out (where, order by), many of which have also been discussed for
pg_dump recently.

- Bill

 
   -Original Message-
   From: Joshua D. Drake [mailto:[EMAIL PROTECTED]
   Sent: Monday, June 12, 2006 4:15 PM
   To: Bill Bartlett
   Cc: 'Andrew Dunstan'; 'Tom Lane'; 'PG Hackers'
   Subject: Re: [HACKERS] CSV mode option for pg_dump
   
   
   Bill Bartlett wrote:
Here's me speaking up -- I'd definitely use it!   As a 
   quick way to pull
data into Excel to do basic reports or analysis, a CSV
   format would be
great.
   
   Why not just use ODBC?
   
   Joshua D. Drake
   --
   
=== The PostgreSQL Company: Command Prompt, Inc. ===
  Sales/Support: +1.503.667.4564 || 24x7/Emergency:
   +1.800.492.2240
  Providing the most comprehensive  PostgreSQL solutions
   since 1997
 http://www.commandprompt.com/
   
   
 -- 
 


---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [HACKERS] CSV mode option for pg_dump

2006-06-13 Thread Steve Atkins


On Jun 13, 2006, at 9:47 AM, Joshua D. Drake wrote:

Would that be adequate, or do we really want to reimplement and  
maintain all

the output format complexity in our own code, in C?


I think the point is that we should provide a native implementation  
because not everyone is crazy enough to use perl (blatant jab ;)).  
I would never expect a customer to write a perl or python script  
just to get their data in what is widely considered a standard  
business format that can be imported by their userland application.


That wasn't what I was discussing, on two levels. Firstly, I wasn't  
suggesting that the end user write anything, secondly I was talking  
about the other output formats discussed (Excel, HTML...) rather than  
just CSV.




The people on the hackers list, are NOT the target for this  
feature. The people on general, admin and novice are.


I was referring to the other export formats mentioned (Excel,  
HTML...). We already support CSV export of single tables with the  
bundled software, don't we?


Cheers,
  Steve

---(end of broadcast)---
TIP 4: Have you searched our list archives?

  http://archives.postgresql.org


Re: [HACKERS] CSV mode option for pg_dump

2006-06-13 Thread Andrew Dunstan



Steve Atkins wrote:



Would that be adequate, or do we really want to reimplement and  
maintain all

the output format complexity in our own code, in C?




Code to produce CSVs is there, now, today, and has been since 8.0.  That 
is *not* what is at issue here. If you want to debate whether or not it 
should be in the backend you are 2 years too late. The ONLY question 
here is about whether or not to have it enabled as an option in pg_dump.


try these and see the interesting results:
 COPY pg_class TO '/tmp/pg_class.csv' CSV;
or in psql
 \copy pg_class to '/tmp/pg_class.csv' csv

As for XML which was also mentioned, you should be aware that there is a 
Google Summer of Code project to implement SQL/XML, so watch this space. 
(As for why that needs to be in the backend, see previous discussions)


cheers

andrew

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] CSV mode option for pg_dump

2006-06-13 Thread Bruce Momjian

pg_dump CSV TODO item removed until we come up with something everyone
can agree on.

---

Joshua D. Drake wrote:
  
  Would that be adequate, or do we really want to reimplement and maintain 
  all
  the output format complexity in our own code, in C?
 
 I think the point is that we should provide a native implementation 
 because not everyone is crazy enough to use perl (blatant jab ;)). I 
 would never expect a customer to write a perl or python script just to 
 get their data in what is widely considered a standard business format 
 that can be imported by their userland application.
 
 The people on the hackers list, are NOT the target for this feature. The 
 people on general, admin and novice are.
 
 Sincerely,
 
 Joshua D. Drake
 
 
 
 
  
  Cheers,
Steve
  
  ---(end of broadcast)---
  TIP 4: Have you searched our list archives?
  
http://archives.postgresql.org
  
 
 
 -- 
 
  === The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240
Providing the most comprehensive  PostgreSQL solutions since 1997
   http://www.commandprompt.com/
 
 
 
 ---(end of broadcast)---
 TIP 5: don't forget to increase your free space map settings
 

-- 
  Bruce Momjian   http://candle.pha.pa.us
  EnterpriseDBhttp://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] CSV mode option for pg_dump

2006-06-13 Thread Andrew Dunstan



Bruce Momjian wrote:


pg_dump CSV TODO item removed until we come up with something everyone
can agree on.

 



That's a pity.

Just to show you how little is involved in what I was suggesting, a 
prototype patch is attached - it's 182 lines of context diff, which is 
pretty small for a new feature. It took me about an hour to write and I 
have tested it against the regression db in both text and binary dump 
modes, where it works without a hitch.


cheers

andrew


Index: src/bin/pg_dump/pg_dump.c
===
RCS file: /cvsroot/pgsql/src/bin/pg_dump/pg_dump.c,v
retrieving revision 1.438
diff -c -r1.438 pg_dump.c
*** src/bin/pg_dump/pg_dump.c	9 Jun 2006 19:46:09 -	1.438
--- src/bin/pg_dump/pg_dump.c	14 Jun 2006 00:32:03 -
***
*** 113,118 
--- 113,123 
  /* flag to turn on/off dollar quoting */
  static int	disable_dollar_quoting = 0;
  
+ /* flag to control if using CSv */
+ static bool use_csv = false;
+ 
+ /* holder for CSV options */
+ static PQExpBuffer csv_opts;
  
  static void help(const char *progname);
  static NamespaceInfo *findNamespace(Oid nsoid, Oid objoid);
***
*** 251,256 
--- 256,265 
  		{disable-triggers, no_argument, disable_triggers, 1},
  		{use-set-session-authorization, no_argument, use_setsessauth, 1},
  
+ 		/* long options with no short version */
+ 		{csv, no_argument, NULL ,2},
+ 		{csv-option,required_argument, NULL, 3},
+ 
  		{NULL, 0, NULL, 0}
  	};
  	int			optindex;
***
*** 285,290 
--- 294,301 
  		}
  	}
  
+ 	csv_opts = createPQExpBuffer();
+ 
  	while ((c = getopt_long(argc, argv, abcCdDE:f:F:h:in:oOp:RsS:t:uU:vWxX:Z:,
  			long_options, optindex)) != -1)
  	{
***
*** 419,424 
--- 430,462 
  break;
  /* This covers the long options equivalent to -X xxx. */
  
+ 			case 2:  /* csv */
+ use_csv = true;
+ break;
+ 
+ 			case 3: /* csv-option */
+ if (strcmp(optarg, singlequote) == 0)
+ {
+ 	appendPQExpBuffer(csv_opts,QUOTE AS  );
+ }
+ else if (strcmp(optarg, tabdelimiter) == 0)
+ {
+ 	appendPQExpBuffer(csv_opts,DELIMITER AS E'\\t' );
+ }
+ else if (strcmp(optarg, header) == 0)
+ {
+ 	appendPQExpBuffer(csv_opts,HEADER );
+ }
+ else
+ {
+ 	fprintf(stderr,
+ 			_(%s: invalid csv option -- %s\n),
+ 			progname, optarg);
+ 	fprintf(stderr, _(Try \%s --help\ for more information.\n), progname);
+ 	exit(1);
+ }
+ 	
+ 
  			case 0:
  break;
  
***
*** 463,468 
--- 501,518 
  		exit(1);
  	}
  
+ 	if ( use_csv == true  dumpInserts == true)
+ 	{
+ 		write_msg(NULL, INSERT (-d, -D) and CSV (--csv) options cannot be used together\n);
+ 		exit(1);
+ 	}
+ 
+ 	if ( use_csv == false  strlen(csv_opts-data)  0)
+ 	{
+ 		write_msg(NULL, You must specify --csv to use --csv-option\n);
+ 		exit(1);
+ 	}
+ 
  	/* open the output file */
  	switch (format[0])
  	{
***
*** 714,719 
--- 764,771 
  			use SESSION AUTHORIZATION commands instead of\n
  			OWNER TO commands\n));
  
+ 	printf(_(  --csvuse CSV mode\n));
+ 	printf(_(  --csv-option=opt one of header, tabdelimiter or singlequote\n));
  	printf(_(\nConnection options:\n));
  	printf(_(  -h, --host=HOSTNAME  database server host or socket directory\n));
  	printf(_(  -p, --port=PORT  database server port number\n));
***
*** 881,898 
  
  	if (oids  hasoids)
  	{
! 		appendPQExpBuffer(q, COPY %s %s WITH OIDS TO stdout;,
  		  fmtQualifiedId(tbinfo-dobj.namespace-dobj.name,
  		 classname),
  		  column_list);
  	}
  	else
  	{
! 		appendPQExpBuffer(q, COPY %s %s TO stdout;,
  		  fmtQualifiedId(tbinfo-dobj.namespace-dobj.name,
  		 classname),
  		  column_list);
  	}
  	res = PQexec(g_conn, q-data);
  	check_sql_result(res, g_conn, q-data, PGRES_COPY_OUT);
  	PQclear(res);
--- 933,955 
  
  	if (oids  hasoids)
  	{
! 		appendPQExpBuffer(q, COPY %s %s WITH OIDS TO stdout,
  		  fmtQualifiedId(tbinfo-dobj.namespace-dobj.name,
  		 classname),
  		  column_list);
  	}
  	else
  	{
! 		appendPQExpBuffer(q, COPY %s %s TO stdout,
  		  fmtQualifiedId(tbinfo-dobj.namespace-dobj.name,
  		 classname),
  		  column_list);
  	}
+ 	if (use_csv)
+ 	{
+ 		appendPQExpBuffer(q,  CSV %s, csv_opts-data);
+ 	}
+ 	appendPQExpBuffer(q, ;);
  	res = PQexec(g_conn, q-data);
  	check_sql_result(res, g_conn, q-data, PGRES_COPY_OUT);
  	PQclear(res);
***
*** 1139,1147 
  		/* must use 2 steps here 'cause fmtId is nonreentrant */
  		appendPQExpBuffer(copyBuf, COPY %s ,
  		  fmtId(tbinfo-dobj.name));
! 		appendPQExpBuffer(copyBuf, %s %sFROM stdin;\n,
  		  fmtCopyColumnList(tbinfo),
  	  (tdinfo-oids  tbinfo-hasoids) ? WITH OIDS  : );
  		copyStmt = copyBuf-data;
 

Re: [HACKERS] CSV mode option for pg_dump

2006-06-13 Thread Tom Lane
Andrew Dunstan [EMAIL PROTECTED] writes:
 This is just nonsense. There is not the slightest reason that CSV data 
 cannot be embedded in a text dump nor exist as the data members of a 
 custom or tar dump with the corresponding COPY statements modified 
 accordingly.

Well, the really *core* question here is whether we trust the stability
of the CSV format definition (and code) enough to want to rely on it for
data dump/restore purposes.  I'm still a few years away from that,
myself.  AFAICT the raison d'etre of the CSV code is emit whatever it
takes to satisfy this, that, and the other broken Microsoft application.
That's fine as an export tool, but as a dump/reload tool, nyet.  If you
put it in pg_dump you're just handing neophytes another foot-gun.

regards, tom lane

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] CSV mode option for pg_dump

2006-06-12 Thread Joshua D. Drake

Andrew Dunstan wrote:


Something someone said on IRC just now triggered a little memory  ... I 
think we should provide an option to have pg_dump work in CSV mode 
rather than text mode. This probably doesn't have much importance in the 
case of text dumps, but in custom or tar dumps where you might want to 
get at individual data members, having an option for CSVs that you want 
to load into some other product might be nice.


This should be a pretty low cost item, I expect (good newbie project?)


Uhh... just about any application that can import CSV can import our 
dumps. It just tell it the delimiter is a tab.


Joshua D. Drake



thoughts?

cheers

andrew

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
  choose an index scan if your joining column's datatypes do not
  match




--

=== The PostgreSQL Company: Command Prompt, Inc. ===
  Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240
  Providing the most comprehensive  PostgreSQL solutions since 1997
 http://www.commandprompt.com/



---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] CSV mode option for pg_dump

2006-06-12 Thread Andrew Dunstan

Joshua D. Drake wrote:

Andrew Dunstan wrote:


Something someone said on IRC just now triggered a little memory  ... 
I think we should provide an option to have pg_dump work in CSV mode 
rather than text mode. This probably doesn't have much importance in 
the case of text dumps, but in custom or tar dumps where you might 
want to get at individual data members, having an option for CSVs 
that you want to load into some other product might be nice.


This should be a pretty low cost item, I expect (good newbie project?)


Uhh... just about any application that can import CSV can import our 
dumps. It just tell it the delimiter is a tab.





No it won't, not if there are tabs in the data.

Why do you think we did CSV in the first place? Precisely because our 
dump does *not* work as a general export mechanism for arbitrary data.


cheers

andrew

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
  choose an index scan if your joining column's datatypes do not
  match


Re: [HACKERS] CSV mode option for pg_dump

2006-06-12 Thread Josh Berkus
Josh,

  This should be a pretty low cost item, I expect (good newbie project?)

 Uhh... just about any application that can import CSV can import our
 dumps. It just tell it the delimiter is a tab.

Actually, there was an Summer of Code applcation to do this, but with all 
the other nifty stuff it wasn't accepted -- partly because we weren't sure 
that we wanted a CSV mode for dumps, partly because this should be a 
weekend fix, not a 3-month project.

-- 
--Josh

Josh Berkus
PostgreSQL @ Sun
San Francisco

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] CSV mode option for pg_dump

2006-06-12 Thread Tom Lane
Andrew Dunstan [EMAIL PROTECTED] writes:
 Something someone said on IRC just now triggered a little memory  ... I 
 think we should provide an option to have pg_dump work in CSV mode 
 rather than text mode. This probably doesn't have much importance in the 
 case of text dumps, but in custom or tar dumps where you might want to 
 get at individual data members, having an option for CSVs that you want 
 to load into some other product might be nice.

This is silly.  You'd just COPY the particular table you want, not use
pg_dump.  pg_dump's already got an unreasonably large number of options
without adding ones that have essentially zero use.  Also, I think there
are sufficient grounds to worry about whether a CSV dump would always
reload correctly --- we already know that that's a poorly thought out
standard.

regards, tom lane

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [HACKERS] CSV mode option for pg_dump

2006-06-12 Thread Joshua D. Drake



No it won't, not if there are tabs in the data.



snipping noise

Hmmm then would just double quoting the data work? At least in OOCalc 
(and IIRC Excel) there is the ability to select a text delimiter.


Joshua D. Drake

--

=== The PostgreSQL Company: Command Prompt, Inc. ===
  Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240
  Providing the most comprehensive  PostgreSQL solutions since 1997
 http://www.commandprompt.com/



---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] CSV mode option for pg_dump

2006-06-12 Thread Andrew Dunstan

Tom Lane wrote:

Andrew Dunstan [EMAIL PROTECTED] writes:
  
Something someone said on IRC just now triggered a little memory  ... I 
think we should provide an option to have pg_dump work in CSV mode 
rather than text mode. This probably doesn't have much importance in the 
case of text dumps, but in custom or tar dumps where you might want to 
get at individual data members, having an option for CSVs that you want 
to load into some other product might be nice.



This is silly.  You'd just COPY the particular table you want, not use
pg_dump.  pg_dump's already got an unreasonably large number of options
without adding ones that have essentially zero use.  Also, I think there
are sufficient grounds to worry about whether a CSV dump would always
reload correctly --- we already know that that's a poorly thought out
standard.


  


Well, if you have dozens or hundreds of tables it might well be more 
convenient.


As for not reloading - I went to some trouble to make sure that we could 
reload what we dumped, exactly, unless the force options are used. I 
might have made a bug in that, but it isn't dependent on the particular 
CSV format used.


Naturally you won't have a use for it, but I suspect others might (in 
which case they had better speak up ;-) )


I suppose the alternative would be to write a little tool in perl or 
whatever to do the same thing for you. Maybe a good pgfoundry project.


cheers

andrew


---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] CSV mode option for pg_dump

2006-06-12 Thread Jim C. Nasby
On Mon, Jun 12, 2006 at 02:15:59PM -0400, Andrew Dunstan wrote:
 Naturally you won't have a use for it, but I suspect others might (in 
 which case they had better speak up ;-) )

I'd bet that those who would find this useful are far more likely to be
on -general and not in here.
-- 
Jim C. Nasby, Sr. Engineering Consultant  [EMAIL PROTECTED]
Pervasive Software  http://pervasive.comwork: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf   cell: 512-569-9461

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] CSV mode option for pg_dump

2006-06-12 Thread Bill Bartlett
Here's me speaking up -- I'd definitely use it!   As a quick way to pull
data into Excel to do basic reports or analysis, a CSV format would be
great.  Some of our users currently pull data into Excel for quickie
analysis, but creating fixed-width data via psql requires them to parse
the data and dumping anything via pg_dump with any delimiter (tabs,
etc.) usually doesn't work due to the delimiters being embedded in the
real data.

- Bill

 -Original Message-
 From: [EMAIL PROTECTED] 
 [mailto:[EMAIL PROTECTED] On Behalf Of 
 Andrew Dunstan
 Sent: Monday, June 12, 2006 2:16 PM
 To: Tom Lane
 Cc: PG Hackers
 Subject: Re: [HACKERS] CSV mode option for pg_dump
 
 
 Tom Lane wrote:
  Andrew Dunstan [EMAIL PROTECTED] writes:

  Something someone said on IRC just now triggered a little 
 memory  ... 
  I
  think we should provide an option to have pg_dump work in CSV mode 
  rather than text mode. This probably doesn't have much 
 importance in the 
  case of text dumps, but in custom or tar dumps where you 
 might want to 
  get at individual data members, having an option for CSVs 
 that you want 
  to load into some other product might be nice.
  
 
  This is silly.  You'd just COPY the particular table you 
 want, not use 
  pg_dump.  pg_dump's already got an unreasonably large number of 
  options without adding ones that have essentially zero use. 
  Also, I 
  think there are sufficient grounds to worry about whether a 
 CSV dump 
  would always reload correctly --- we already know that 
 that's a poorly 
  thought out standard.
 
  

 
 Well, if you have dozens or hundreds of tables it might well be more 
 convenient.
 
 As for not reloading - I went to some trouble to make sure 
 that we could 
 reload what we dumped, exactly, unless the force options are used. I 
 might have made a bug in that, but it isn't dependent on the 
 particular 
 CSV format used.
 
 Naturally you won't have a use for it, but I suspect others might (in 
 which case they had better speak up ;-) )
 
 I suppose the alternative would be to write a little tool in perl or 
 whatever to do the same thing for you. Maybe a good pgfoundry project.
 
 cheers
 
 andrew
 
 
 ---(end of 
 broadcast)---
 TIP 5: don't forget to increase your free space map settings
 


---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] CSV mode option for pg_dump

2006-06-12 Thread Joshua D. Drake

Bill Bartlett wrote:

Here's me speaking up -- I'd definitely use it!   As a quick way to pull
data into Excel to do basic reports or analysis, a CSV format would be
great. 


Why not just use ODBC?

Joshua D. Drake
--

=== The PostgreSQL Company: Command Prompt, Inc. ===
  Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240
  Providing the most comprehensive  PostgreSQL solutions since 1997
 http://www.commandprompt.com/



---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

  http://www.postgresql.org/docs/faq


Re: [HACKERS] CSV mode option for pg_dump

2006-06-12 Thread Bill Bartlett
Can't -- the main production database is over at a CoLo site with access
only available via SSH, and tightly-restricted SSH at that. Generally
one of the developers will SSH over to the server, pull out whatever
data is needed into a text file via psql or pg_dump, scp the file(s)
back here and send them to the user.

We're working on implementing a full-blown reporting system to eliminate
some of this need, but it's not quite there yet.

However, CSV is a very handy format to use to send data to other people
(especially less technical users), so even in the future the ability
to export to CSV would be handy.

- Bill

 -Original Message-
 From: Joshua D. Drake [mailto:[EMAIL PROTECTED] 
 Sent: Monday, June 12, 2006 4:15 PM
 To: Bill Bartlett
 Cc: 'Andrew Dunstan'; 'Tom Lane'; 'PG Hackers'
 Subject: Re: [HACKERS] CSV mode option for pg_dump
 
 
 Bill Bartlett wrote:
  Here's me speaking up -- I'd definitely use it!   As a 
 quick way to pull
  data into Excel to do basic reports or analysis, a CSV 
 format would be 
  great.
 
 Why not just use ODBC?
 
 Joshua D. Drake
 -- 
 
  === The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564 || 24x7/Emergency: 
 +1.800.492.2240
Providing the most comprehensive  PostgreSQL solutions 
 since 1997
   http://www.commandprompt.com/
 
 


---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [HACKERS] CSV mode option for pg_dump

2006-06-12 Thread Bruce Momjian

Already on TODO:

pg_dump:

  o %Add CSV output format


---

Tom Lane wrote:
 Andrew Dunstan [EMAIL PROTECTED] writes:
  Something someone said on IRC just now triggered a little memory  ... I 
  think we should provide an option to have pg_dump work in CSV mode 
  rather than text mode. This probably doesn't have much importance in the 
  case of text dumps, but in custom or tar dumps where you might want to 
  get at individual data members, having an option for CSVs that you want 
  to load into some other product might be nice.
 
 This is silly.  You'd just COPY the particular table you want, not use
 pg_dump.  pg_dump's already got an unreasonably large number of options
 without adding ones that have essentially zero use.  Also, I think there
 are sufficient grounds to worry about whether a CSV dump would always
 reload correctly --- we already know that that's a poorly thought out
 standard.
 
   regards, tom lane
 
 ---(end of broadcast)---
 TIP 1: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to [EMAIL PROTECTED] so that your
message can get through to the mailing list cleanly
 

-- 
  Bruce Momjian   http://candle.pha.pa.us
  EnterpriseDBhttp://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match