Re: csv format for psql

2018-11-27 Thread Daniel Verite
Tom Lane wrote:

> what I did instead was just to make
> csv_print_field force field quoting if any of these cases could
> possibly apply.  That will result in excess quoting in some
> cases, but I think that's fine, since they're all pretty uncommon.

Indeed.

> (BTW, it seems only chance that the server's CSV mode isn't also
> subject to this case, but since it always quotes empty fields,
> I think we're OK there.)
> 
> Pushed with that correction and some other last-minute review.

Thanks!


Best regards,
-- 
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite



Re: csv format for psql

2018-11-26 Thread Pavel Stehule
>
> Pushed with that correction and some other last-minute review.
>

Great! Thank you

Pavel


>
> regards, tom lane
>
>


Re: csv format for psql

2018-11-26 Thread Tom Lane
"Daniel Verite"  writes:
>> A proposed fix is attached. print_csv_vertical() is left unchanged
>> because it's not possible currently to end up with \. alone
>> on a line with the expanded display

> On second thought, it is possible
> postgres=# \pset format csv
> Output format is csv.
> postgres=# \pset fieldsep_csv '.'
> Field separator for CSV is ".".
> postgres=# \x
> Expanded display is on.
> postgres=# select '' as "\" ;
> \.

Ugh.  There's more to that than meets the eye; consider also

\pset csv_fieldsep '.'
select '\' as d1, '' as d2;

or the reverse case where we set the fieldsep to '\' and then
emit a second field containing '.'.

I think that the approach you're using doesn't really scale to handle
all these cases sanely, so what I did instead was just to make
csv_print_field force field quoting if any of these cases could
possibly apply.  That will result in excess quoting in some
cases, but I think that's fine, since they're all pretty uncommon.

(BTW, it seems only chance that the server's CSV mode isn't also
subject to this case, but since it always quotes empty fields,
I think we're OK there.)

Pushed with that correction and some other last-minute review.

regards, tom lane



Re: csv format for psql

2018-11-26 Thread Daniel Verite
I wrote:

> A proposed fix is attached. print_csv_vertical() is left unchanged
> because it's not possible currently to end up with \. alone
> on a line with the expanded display

On second thought, it is possible

postgres=# \pset format csv
Output format is csv.

postgres=# \pset fieldsep_csv '.'
Field separator for CSV is ".".

postgres=# \x
Expanded display is on.

postgres=# select '' as "\" ;
\.

PFA an upgraded fix.


Best regards,
-- 
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite
diff --git a/src/fe_utils/print.c b/src/fe_utils/print.c
index 3b07280..d9c701f 100644
--- a/src/fe_utils/print.c
+++ b/src/fe_utils/print.c
@@ -1793,7 +1793,16 @@ print_csv_text(const printTableContent *cont, FILE *fout)
if (cont->opt->start_table && !cont->opt->tuples_only)
{
/* print headers */
-   for (ptr = cont->headers; *ptr; ptr++)
+
+   /*
+* A \. alone on a line must be quoted to be compatible with 
COPY CSV,
+* for which \. is an end-of-data marker.
+*/
+   if (cont->ncolumns == 1 && strcmp(*cont->headers, "\\.") == 0)
+   {
+   csv_escaped_print(*cont->headers, fout);
+   }
+   else for (ptr = cont->headers; *ptr; ptr++)
{
if (ptr != cont->headers)
fputs(cont->opt->fieldSepCsv, fout);
@@ -1805,7 +1814,14 @@ print_csv_text(const printTableContent *cont, FILE *fout)
/* print cells */
for (i = 0, ptr = cont->cells; *ptr; i++, ptr++)
{
-   csv_print_field(*ptr, fout, cont->opt->fieldSepCsv);
+   /*
+* A \. alone on a line must be quoted to be compatible with 
COPY CSV,
+* for which \. is an end-of-data marker.
+*/
+   if (cont->ncolumns == 1 && strcmp(*ptr, "\\.") == 0)
+   csv_escaped_print(*ptr, fout);
+   else
+   csv_print_field(*ptr, fout, cont->opt->fieldSepCsv);
if ((i + 1) % cont->ncolumns)
fputs(cont->opt->fieldSepCsv, fout);
else
@@ -1825,16 +1841,28 @@ print_csv_vertical(const printTableContent *cont, FILE 
*fout)
if (cancel_pressed)
return;
 
-   /* print name of column */
-   csv_print_field(cont->headers[i % cont->ncolumns], fout,
-   cont->opt->fieldSepCsv);
-
-   /* print field separator */
-   fputs(cont->opt->fieldSepCsv, fout);
+   /*
+* A \. alone on a line must be quoted to be compatible with 
COPY CSV,
+* for which \. is an end-of-data marker.
+*/
+   if (strcmp(cont->headers[i % cont->ncolumns], "\\") == 0
+   && strcmp(cont->opt->fieldSepCsv, ".") == 0
+   && **ptr == '\0')
+   {
+   csv_escaped_print("\\.", fout);
+   }
+   else
+   {
+   /* print name of column */
+   csv_print_field(cont->headers[i % cont->ncolumns], fout,
+   cont->opt->fieldSepCsv);
 
-   /* print field value */
-   csv_print_field(*ptr, fout, cont->opt->fieldSepCsv);
+   /* print field separator */
+   fputs(cont->opt->fieldSepCsv, fout);
 
+   /* print field value */
+   csv_print_field(*ptr, fout, cont->opt->fieldSepCsv);
+   }
fputc('\n', fout);
}
 }


Re: csv format for psql

2018-11-26 Thread Daniel Verite
Tom Lane wrote:

> OK, reasonable arguments were made why not to allow multi-character
> separators.  Should we then match the server and insist on a single-byte
> separator?  It's a bit inconsistent if psql can be made to emit "csv"
> files that COPY can't read, especially when it's otherwise a subset
> of what COPY allows.

I seem to be the only one advocating to accept an Unicode
character here, and I won't insist on it.

There's still a minor annoyance to solve if we want to claim full
compatibility with COPY CSV: backslash followed by dot must be
quoted to avoid being interpreted as an end of data indicator.

A proposed fix is attached. print_csv_vertical() is left unchanged
because it's not possible currently to end up with \. alone
on a line with the expanded display: we'd need to allow
first for an empty field separator, I believe.


Best regards,
-- 
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite
diff --git a/src/fe_utils/print.c b/src/fe_utils/print.c
index 3b07280..d31b6af 100644
--- a/src/fe_utils/print.c
+++ b/src/fe_utils/print.c
@@ -1793,7 +1793,16 @@ print_csv_text(const printTableContent *cont, FILE *fout)
if (cont->opt->start_table && !cont->opt->tuples_only)
{
/* print headers */
-   for (ptr = cont->headers; *ptr; ptr++)
+
+   /*
+* A \. alone on a line must be quoted to be compatible with 
COPY CSV,
+* for which \. is an end-of-data marker.
+*/
+   if (cont->ncolumns == 1 && strcmp(*cont->headers, "\\.") == 0)
+   {
+   csv_escaped_print(*cont->headers, fout);
+   }
+   else for (ptr = cont->headers; *ptr; ptr++)
{
if (ptr != cont->headers)
fputs(cont->opt->fieldSepCsv, fout);
@@ -1805,7 +1814,14 @@ print_csv_text(const printTableContent *cont, FILE *fout)
/* print cells */
for (i = 0, ptr = cont->cells; *ptr; i++, ptr++)
{
-   csv_print_field(*ptr, fout, cont->opt->fieldSepCsv);
+   /*
+* A \. alone on a line must be quoted to be compatible with 
COPY CSV,
+* for which \. is an end-of-data marker.
+*/
+   if (cont->ncolumns == 1 && strcmp(*ptr, "\\.") == 0)
+   csv_escaped_print(*ptr, fout);
+   else
+   csv_print_field(*ptr, fout, cont->opt->fieldSepCsv);
if ((i + 1) % cont->ncolumns)
fputs(cont->opt->fieldSepCsv, fout);
else


Re: csv format for psql

2018-11-26 Thread Tom Lane
"David G. Johnston"  writes:
> On Sun, Nov 25, 2018 at 6:27 PM Tom Lane  wrote:
>> 1. Are we limiting the separator to be a single-byte character or not?

> I agree with what others have said that expanding functionality in
> this direction is more likely to mask errors than be useful.

OK, reasonable arguments were made why not to allow multi-character
separators.  Should we then match the server and insist on a single-byte
separator?  It's a bit inconsistent if psql can be made to emit "csv"
files that COPY can't read, especially when it's otherwise a subset
of what COPY allows.

>> 2. Speaking of the field separator, I'm pretty desperately unhappy
>> with the choice of "fieldsep_csv" as the parameter name.[...]
>> We could avoid this self-inflicted confusion by choosing a different
>> parameter name.  I'd be good with "csv_fieldsep" or "csvfieldsep".

> Make sense to me - with the underscore personally.

Barring complaints, I'll switch it to "csv_fieldsep".

regards, tom lane



Re: csv format for psql

2018-11-26 Thread David G. Johnston
On Sun, Nov 25, 2018 at 6:27 PM Tom Lane  wrote:
> I think there are two remaining points to settle:
>
> 1. Are we limiting the separator to be a single-byte character or not?

I agree with what others have said that expanding functionality in
this direction is more likely to mask errors than be useful.

> I'm a bit inclined to the former viewpoint.
> If we were in the business of being restrictive, why would we allow
> the field separator to be changed at all?  The name of the format is
> *comma* separated values, not something else.

I still stand by the more inclusive, and arguably modern, name
"character separated values" for the abbreviation...which can be taken
to mean any single character quite easily and is how it appears to be
used these days in any case.

> 2. Speaking of the field separator, I'm pretty desperately unhappy
> with the choice of "fieldsep_csv" as the parameter name.[...]
> We could avoid this self-inflicted confusion by choosing a different
> parameter name.  I'd be good with "csv_fieldsep" or "csvfieldsep".

Make sense to me - with the underscore personally.

David J.



Re: csv format for psql

2018-11-26 Thread Daniel Verite
Tom Lane wrote:

> And, in fact, right now *none* of psql's table output formats is both
> unambiguous and reasonably simple/popular to use.  So the astonishing
> thing about this patch, IMO, is that we didn't do it a decade ago.

Yeah, that's what motivated this submission in the first place.

> I feel that if we allow multi-byte characters here, we might as well
> take the training wheels off and just say you can use any separator
> string you want, as long as it doesn't contain double quote, \r, or \n.

The reason behind disallowing multiple characters was the
likeliness of them being mistakes. For instance, this example came
up at some point in the discussion:
  \pset fieldsep_csv ,,
To me the probability that a user has fat-fingered this is pretty high,
and this would silently produce a really bogus file.

Another kind of mistake comes from the difficulty of properly
quoting on the command line:
psql -- csv -P fieldsep_csv='\t'
would be interpreted as a two-character separator despite being
obviously not the user's intention.

About disallowing characters beyond US-ASCII, I can't find a similar
justification. COPY does not allow them, but it's justified (in the
archives) by the fear of being slower when importing, which is not a
concern here.

> We could avoid this self-inflicted confusion by choosing a different
> parameter name.  I'd be good with "csv_fieldsep" or "csvfieldsep".

+1

> Or we could kill both issues by hard-wiring the separator as ','.

Ideally people would understand that they can use -A for any delimiter
but no quoting, or --csv with strict quoting and in that case a fixed
delimiter is fine, since it's going to be safely quoted, its presence in
the data is not a problem. But I'm not too confident that everyone
would understand it that way, even if it's well explained in the doc.

When one is told "please produce CSV files with semi-colons as
separators", it's simpler to just produce that rather than arguing
that these requirements are probably ill-advised.


Best regards,
-- 
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite



Re: csv format for psql

2018-11-26 Thread Magnus Hagander
n Mon, Nov 26, 2018 at 5:53 AM Pavel Stehule 
wrote:

>
>
> po 26. 11. 2018 v 5:36 odesílatel Andrew Gierth <
> and...@tao11.riddles.org.uk> napsal:
>
>> > "Tom" == Tom Lane  writes:
>>
>>  Tom> Or we could kill both issues by hard-wiring the separator as ','.
>>
>> Using ';' for the delimiter isn't all that rare.
>>
>
> ; is default for CSV produced by MS Excel in Czech mutation - so for some
> countries, like CR is common.
>
>
I believe this is a common occurance in countries that use decimal comma
instead of decimal point, which is a fairly large portion of the world.

/Magnus


Re: csv format for psql

2018-11-25 Thread Corey Huinker
On Sun, Nov 25, 2018 at 11:23 PM Tom Lane  wrote:

> Corey Huinker  writes:
> > Could we have another output type called "separated" that uses the
> existing
> > --fieldsep / --recordsep?
>
> Uh, what's the difference from the existing unaligned format?
>

No footer and I guess we'd want to escape instances of fieldsep and
recordsep in the data, so I guess if we had an option to escape instances
of fieldsep/recordsep found in the data, unaligned would work fine.


Re: csv format for psql

2018-11-25 Thread Pavel Stehule
po 26. 11. 2018 v 5:36 odesílatel Andrew Gierth 
napsal:

> > "Tom" == Tom Lane  writes:
>
>  Tom> Or we could kill both issues by hard-wiring the separator as ','.
>
> Using ';' for the delimiter isn't all that rare.
>

; is default for CSV produced by MS Excel in Czech mutation - so for some
countries, like CR is common.

Regards

Pavel

>
> But I don't see any reason to allow multibyte or non-ascii characters or
> arbitrary strings.
>
> --
> Andrew (irc:RhodiumToad)
>
>


Re: csv format for psql

2018-11-25 Thread Andrew Gierth
> "Tom" == Tom Lane  writes:

 Tom> Or we could kill both issues by hard-wiring the separator as ','.

Using ';' for the delimiter isn't all that rare.

But I don't see any reason to allow multibyte or non-ascii characters or
arbitrary strings.

-- 
Andrew (irc:RhodiumToad)



Re: csv format for psql

2018-11-25 Thread Tom Lane
Corey Huinker  writes:
> Could we have another output type called "separated" that uses the existing
> --fieldsep / --recordsep?

Uh, what's the difference from the existing unaligned format?

regards, tom lane



Re: csv format for psql

2018-11-25 Thread Corey Huinker
>
>
> Or we could kill both issues by hard-wiring the separator as ','.


+1

I've never encountered a situation where a customer wanted a custom
delimiter AND quoted strings. So either they wanted pure CSV or a customed
TSV.

Could we have another output type called "separated" that uses the existing
--fieldsep / --recordsep? Word will get out that csv is faster, but we'd
still have the flexibility if somebody really wanted it.


Re: csv format for psql

2018-11-25 Thread Tom Lane
Fabien COELHO  writes:
> Basically the proposed patch addresses a simple and convenient use case 
> which are neither addressed by \copy nor COPY. The fact that more options 
> are available with these commands does it precludes its usefulness as is.

Yeah, I agree that this option is useful independently of whether COPY
provides something similar.  I think the killer argument is that right
now, psql-ref.sgml repeatedly suggests that unaligned mode with fieldsep
',' is a reasonable way to produce comma-separated output for consumption
by other programs.  That's like handing our users a loaded foot-gun.
And, in fact, right now *none* of psql's table output formats is both
unambiguous and reasonably simple/popular to use.  So the astonishing
thing about this patch, IMO, is that we didn't do it a decade ago.

I went through the documentation to improve that point, and did some other
cosmetic cleanup including rebasing up to HEAD, and got the attached.

I think there are two remaining points to settle:

1. Are we limiting the separator to be a single-byte character or not?
If we are, the code could be made simpler and faster by working with a
"char" value instead of a string.  If we're not, then Michael's change
needs to be undone (but I didn't do that here).

I feel that if we allow multi-byte characters here, we might as well
take the training wheels off and just say you can use any separator
string you want, as long as it doesn't contain double quote, \r, or \n.
Most programs reading a file are not going to perceive a difference
between separating fields with a single multibyte character and multiple
single-byte characters; either they can cope or they can't.  A fair
number of them are going to be in the latter category.  So we can either
say "do what you wish, it's your problem whether anything can read the
result" or "we're going to restrict you to something that (perhaps) is
more widely readable".  I'm a bit inclined to the former viewpoint.
If we were in the business of being restrictive, why would we allow
the field separator to be changed at all?  The name of the format is
*comma* separated values, not something else.

2. Speaking of the field separator, I'm pretty desperately unhappy
with the choice of "fieldsep_csv" as the parameter name.  The trouble
with that is that it encourages sticking "fieldsep_csv" in between
"fieldsep" and "fieldsep_zero", because alphabet.  But "fieldsep" and
"fieldsep_zero" are *intimately* tied together, in fact it's only a
dubious implementation choice that made them separate parameters at all.
It does not make any semantic sense to stick other vaguely-related
parameters in between, neither in the documentation nor in \pset output.

We could avoid this self-inflicted confusion by choosing a different
parameter name.  I'd be good with "csv_fieldsep" or "csvfieldsep".

Or we could kill both issues by hard-wiring the separator as ','.

Thoughts?

regards, tom lane

diff --git a/doc/src/sgml/ref/psql-ref.sgml b/doc/src/sgml/ref/psql-ref.sgml
index 6e6d0f4..d53451b 100644
*** a/doc/src/sgml/ref/psql-ref.sgml
--- b/doc/src/sgml/ref/psql-ref.sgml
*** PostgreSQL documentation
*** 68,75 


Switches to unaligned output mode. (The default output mode is
!   otherwise aligned.)  This is equivalent to \pset format
!   unaligned.


  
--- 68,75 


Switches to unaligned output mode. (The default output mode is
!   aligned.)  This is equivalent to
!   \pset format unaligned.


  
*** EOF
*** 152,157 
--- 152,167 
  
  
  
+   --csv
+   
+   
+   Switches to CSV (Comma-Separated Values) output
+   mode.  This is equivalent to \pset format csv.
+   
+   
+ 
+ 
+ 
-d dbname
--dbname=dbname

*** EOF
*** 270,277 
--html


!   Turn on HTML tabular output. This is
!   equivalent to \pset format html or the
\H command.


--- 280,287 
--html


!   Switches to HTML output mode.  This is
!   equivalent to \pset format html or the
\H command.


*** lo_import 152801
*** 2547,2554 


Specifies the field separator to be used in unaligned output
!   format. That way one can create, for example, tab- or
!   comma-separated output, which other programs might prefer. To
set a tab as field separator, type \pset fieldsep
'\t'. The default field separator is
'|' (a vertical bar).
--- 2557,2564 


Specifies the field separator to be used in unaligned output
!   format. That way one can create, for example, tab-separated
!   output, which other programs might prefer. To
set a 

Re: csv format for psql

2018-11-14 Thread Fabien COELHO


Bonjour Michaël,


But again COPY is concerned with importing the data that preexists,
even if it's weird, whereas a psql output formats are not.


Hm.  I checked the contents of the patch in details which provide output
consistent with COPY, but after looking at the global picture I am
getting cold feet on this patch for a couple of reasons:
- This stuff adds new code paths in the frontend mimicking what the
backend already does for years, both doing the same thing.


 - Backend's COPY cannot write to client space, so the comparison
   is not relevant.

 - "\copy (SQL-query) TO STDOUT CSV;" is kind of a pain, because one has
   to edit around the query, which is not convenient, esp from the
   command line:

   sh> psql --csv -c 'SELECT 1 as one, 2 as two' > out.csv

   vs

   sh> psql -c "\copy (SELECT 1 AS one, 2 as two) TO STDOUT CSV" > out.csv

   or mixing the output file name inside the argument, which is very
   unshell like:

   sh> psql -c "\copy (SELECT 1 AS one, 2 as two) TO 'out.csv' CSV"

If you have a "query.sql" file that you want to output in csv, there is no 
simple way to do that with \copy/COPY, whereas "psql --csv -f query.sql" 
looks pretty straightforward to me. Also, in a makefile, I could write:


   %.csv: %.sql
   psql --csv -f $< > $@

My point is that \copy, COPY and the proposed CSV format do not address 
the same use cases.



- There are already three ways to fetch data in this format with COPY,
\copy and file_fdw, with all three using the same code paths for option
validations (I can see the arguments at the top of the thread for which
COPY SELECT can actually do everything you want with?).
- The experience is confusing, as the psql format uses different options
than the backend to do the same things:
-- tuples_only instead of HEADER.
-- fieldsep_csv instead of DELIMITER
-- null is an equivalent of the one with the same name, which is
actually consistent.
-- encoding is also an equivalent of ENCODING.
-- and all the others missing.
That looks like a lot.


I disagree on this one: the proposed csv format just follows the existing 
psql format pattern used for 8 formats and reuses it for csv.


Note that there are existing command line options for tuples_only (-t), 
encoding is inherited from the shell and does not need to be changed that 
often nowadays, fieldsep_csv is kind-of a pain, but then if someone wants 
"comma-separated-values" NOT separated by commas, probably they can handle 
it.


Basically the proposed patch addresses a simple and convenient use case 
which are neither addressed by \copy nor COPY. The fact that more options 
are available with these commands does it precludes its usefulness as is.


--
Fabien.

Re: csv format for psql

2018-11-11 Thread Pavel Stehule
po 12. 11. 2018 v 5:19 odesílatel David G. Johnston <
david.g.johns...@gmail.com> napsal:

> On Friday, November 9, 2018, Michael Paquier  wrote:
>
>> On Fri, Nov 09, 2018 at 05:28:07PM +0100, Daniel Verite wrote:
>> > But again COPY is concerned with importing the data that preexists,
>> > even if it's weird, whereas a psql output formats are not.
>>
>> Hm.  I checked the contents of the patch in details which provide output
>> consistent with COPY, but after looking at the global picture I am
>> getting cold feet on this patch for a couple of reasons:
>> - This stuff adds new code paths in the frontend mimicking what the
>> backend already does for years, both doing the same thing.
>
>
> From the original post:
>
> "copy with csv can't help for the output of meta-commands
> such as \gx, \crosstabview, \l, \d ... whereas a CSV format within psql
> does work with these."
>
> Formatting is usually a client-side concern so this feature fits well
> there. The fact that we’ve provided a server interface for the same doesn’t
> preclude its potential desirability in the client.
>
> - There are already three ways to fetch data in this format with COPY,
>> \copy and file_fdw, with all three using the same code paths for option
>> validations (I can see the arguments at the top of the thread for which
>> COPY SELECT can actually do everything you want with?).
>
>
> Not always conveniently.
>
>
>> - The experience is confusing, as the psql format uses different options
>> than the backend to do the same things:
>
>
> Yes, those who use psql need to learn its features.  I’d posit that since
> this syntax is being learned anyway that transferring said knowledge to a
> newly added csv format will not be confusing.  No more so that having to do
> something that is usually client-side (formatting) on the server in the
> first place.  That we don’t fully replicate the server functionality does’t
> bother be.  This is meant to be a simple and quick ability that handles 95%
> of use cases and defers to the more powerful server version for the
> outliers.
>

This patch is not simple - not due own complexity, but due current state of
psql and output format support. The psql is aged software and implement new
format well known is not simple.

The COPY statement has different purpose and because it has server side
implementation, it cannot to cover client side space.

I afraid so there hardy be designed some better - and it is unhappy so psql
has not native csv support yet.

Regards

Pavel



> Feature-wise I’m on board.  Given it’s already written I’d say it should
> go in unless there are code complexity and support concerns - which given
> the prevalence of other formats I have to believe there are not.
>
> David J.
>
>


Re: csv format for psql

2018-11-11 Thread David G. Johnston
On Friday, November 9, 2018, Michael Paquier  wrote:

> On Fri, Nov 09, 2018 at 05:28:07PM +0100, Daniel Verite wrote:
> > But again COPY is concerned with importing the data that preexists,
> > even if it's weird, whereas a psql output formats are not.
>
> Hm.  I checked the contents of the patch in details which provide output
> consistent with COPY, but after looking at the global picture I am
> getting cold feet on this patch for a couple of reasons:
> - This stuff adds new code paths in the frontend mimicking what the
> backend already does for years, both doing the same thing.


>From the original post:

"copy with csv can't help for the output of meta-commands
such as \gx, \crosstabview, \l, \d ... whereas a CSV format within psql
does work with these."

Formatting is usually a client-side concern so this feature fits well
there. The fact that we’ve provided a server interface for the same doesn’t
preclude its potential desirability in the client.

- There are already three ways to fetch data in this format with COPY,
> \copy and file_fdw, with all three using the same code paths for option
> validations (I can see the arguments at the top of the thread for which
> COPY SELECT can actually do everything you want with?).


Not always conveniently.


> - The experience is confusing, as the psql format uses different options
> than the backend to do the same things:


Yes, those who use psql need to learn its features.  I’d posit that since
this syntax is being learned anyway that transferring said knowledge to a
newly added csv format will not be confusing.  No more so that having to do
something that is usually client-side (formatting) on the server in the
first place.  That we don’t fully replicate the server functionality does’t
bother be.  This is meant to be a simple and quick ability that handles 95%
of use cases and defers to the more powerful server version for the
outliers.

Feature-wise I’m on board.  Given it’s already written I’d say it should go
in unless there are code complexity and support concerns - which given the
prevalence of other formats I have to believe there are not.

David J.


Re: csv format for psql

2018-11-11 Thread Daniel Verite
Michael Paquier wrote:

> - The experience is confusing, as the psql format uses different options
> than the backend to do the same things:
> -- tuples_only instead of HEADER.
> -- fieldsep_csv instead of DELIMITER
> -- null is an equivalent of the one with the same name, which is
> actually consistent.
> -- encoding is also an equivalent of ENCODING.

That conveniently ignores the fact that "\pset format somename"
as a command doesn't take any additional option, contrary to COPY.
We can't do \pset format csv (delimiter=';')
If we choosed "delimiter" as an option name, it would have to exist
within 20 other names in the \pset namespace and then it would be
too vague, whereas "fieldsep_csv" makes it clear what it applies to.
"tuples_only" is preexisting, and I don't see how the comment that it's
not called "header" could be actionable.

Overall, you seem to posit that we should mimic the entire
COPY TO interface to implement 'psql --csv'.
But the starting point is that 'psql --csv' is just a slightly
different (safer) variant of 'psql -At', which is not concerned
at all with being consistent with COPY.
The model of the csv output format is the unaligned output format,
it's just not COPY.


Best regards,
-- 
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite



Re: csv format for psql

2018-11-09 Thread Daniel Verite
Michael Paquier wrote:

> Still what's the point except complicating the code?  We don't care
> about anything fancy in the backend-side ProcessCopyOptions() when
> checking cstate->delim, and having some consistency looks like a good
> thing to me.

The backend has its reasons that don't apply to the psql output
format, mostly import performance according to [1]
It's not that nobody wants delimiter outside of US-ASCII,
as people do ask for that sometimes:

https://www.postgresql.org/message-id/f02ulk%242r3u%241%40news.hub.org
https://github.com/greenplum-db/gpdb/issues/1246

> However there is no option to specify
> an escape character, no option to specify a quote character, and it is
> not possible to force quotes for all values.  Those are huge advantages
> as any output can be made compatible with other CSV variants.  Isn't
> what is presented too limited? 

The guidelines that the patch has been following are those of RFC 4180 [2]
with two exceptions on the field separator that we can define
and the end of lines that are OS-dependant instead of the fixed CRLF
that IETF seems to see as the norm.

The only reference to escaping in the RFC is:
   "If double-quotes are used to enclose fields, then a double-quote
   appearing inside a field must be escaped by preceding it with
   another double quote"

The problem with using non-standard QUOTE or ESCAPE is that it's a
violation of the format that goes further than choosing a separator
different than comma, which is already a pain point.
We can always add these options later if there is demand. I suspect it
will never happen.

I looked at the 2004 archives when CSV was added to COPY, that's
around commit 862b20b38 in case anyone cares to look, but
I couldn't find a discussion on these options, all I could find is they were
present from the start.

But again COPY is concerned with importing the data that preexists,
even if it's weird, whereas a psql output formats are not.


[1] https://www.postgresql.org/message-id/4C9D2BC5.1080006%40optonline.net
[2] https://tools.ietf.org/html/rfc4180


Best regards,
-- 
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite



Re: csv format for psql

2018-11-09 Thread Michael Paquier
On Fri, Nov 09, 2018 at 11:40:44AM +0100, Daniel Verite wrote:
> The above is meant to accept a multibyte character as the separator,
> in which case strlen(value) would be greater than 1.

Still what's the point except complicating the code?  We don't care
about anything fancy in the backend-side ProcessCopyOptions() when
checking cstate->delim, and having some consistency looks like a good
thing to me.

I am also questioning a couple of things: it is possible to choose
freely how to represent a null value thanks to "\pset null", which gives
an equivalent to the COPY option.  Choosing if a header is present or
not can be done with tuples_only.  However there is no option to specify
an escape character, no option to specify a quote character, and it is
not possible to force quotes for all values.  Those are huge advantages
as any output can be made compatible with other CSV variants.  Isn't
what is presented too limited?  \copy has at least the advantage to push
down those responsibilities to the backend..
--
Michael


signature.asc
Description: PGP signature


Re: csv format for psql

2018-11-09 Thread Daniel Verite
Michael Paquier wrote:

> +/* check for value being non-empty and with an MB length of 1 */
> +if (*value == '\0' || value[PQmblen(value, pset.encoding)] != '\0')
> 
> It seems to me that this can just be replaced with that:
> if (strlen(value) != 1)

The above is meant to accept a multibyte character as the separator,
in which case strlen(value) would be greater than 1.

Best regards,
-- 
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite



Re: csv format for psql

2018-11-08 Thread Pavel Stehule
pá 9. 11. 2018 v 6:57 odesílatel Michael Paquier 
napsal:

> On Thu, Nov 08, 2018 at 01:58:34PM +0900, Michael Paquier wrote:
> > Anyway, I am still going through the patch, so no need to send a new
> > version for now.
>
> Okay, I have done a round of more in-depth review, and the patch looks
> to be in pretty good shape.
>
> Relying on tuples_only to decide if the header should be printed or not
> looks good to me.
>
> +/* check for value being non-empty and with an MB length of 1 */
> +if (*value == '\0' || value[PQmblen(value, pset.encoding)] != '\0')
>
> It seems to me that this can just be replaced with that:
> if (strlen(value) != 1)
>
> Attached is what I am finishing up with for the moment.  Comments are
> welcome.  I am still planning look at that stuff a bit more once again,
> particularly the printing part, but I am lacking of time now..
>

looks ok

Pavel

--
> Michael
>


Re: csv format for psql

2018-11-07 Thread Michael Paquier
On Wed, Nov 07, 2018 at 05:36:54PM +0100, Daniel Verite wrote:
> I guess it could go into a #define in psql/settings.h, along with
> these:
> #define DEFAULT_FIELD_SEP "|"
> #define DEFAULT_RECORD_SEP "\n"

Yes.

> Yes, since we support column names with embedded newlines (even
> though it's hard to think of a legit use case for that) and CSV fields
> support embedded newlines too.

Hm.  We would likely want at least a comment about that..

Anyway, I am still going through the patch, so no need to send a new
version for now.
--
Michael


signature.asc
Description: PGP signature


Re: csv format for psql

2018-11-07 Thread Daniel Verite
Michael Paquier wrote:

> -C could also be useful for other things, say compression.

The patch does not grab any short letter, as the consensus
went that way.

> +   pset.popt.topt.fieldSepCsv = pg_strdup(",");
> Let's store that in a variable instead of hardcoding it.

I guess it could go into a #define in psql/settings.h, along with
these:
#define DEFAULT_FIELD_SEP "|"
#define DEFAULT_RECORD_SEP "\n"

> In the regression tests, "col 9" is wanted with a newline?

Yes, since we support column names with embedded newlines (even
though it's hard to think of a legit use case for that) and CSV fields
support embedded newlines too.


Best regards,
-- 
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite



Re: csv format for psql

2018-11-07 Thread Fabien COELHO


Bonjour Michaël,


First, it seems to me that it is necessary to overcomplicate the
interface of psql for that.  Most of the other formats don't have their
own switch, and it is perfectly possible to use CSV format with just
--pset=format=csv.


Although this work, it is not very user friendly nor easy to memorize.

Quite a few direct format options already exists, eg -A --no-align (yuk, 
-A means not to align... :-(), -H --html...


In this context "--csv" looks pretty unambiguous and more useful, so I 
would be in favor of keeping this option.



-C could also be useful for other things, say compression.


Sure. -C is *NOT* used for --csv in the patch, although it may have been 
with an earlier version?



In the regression tests, "col 9" is wanted with a newline?


Yes, the point is to check whether quoting & escaping are triggered 
appropriately in column names and values.


--
Fabien.

Re: csv format for psql

2018-11-07 Thread Michael Paquier
On Tue, Nov 06, 2018 at 02:57:15PM +0100, Daniel Verite wrote:
> Here's a rebased version following these changes.

Thanks for doing a rebase.

First, it seems to me that it is necessary to overcomplicate the
interface of psql for that.  Most of the other formats don't have their
own switch, and it is perfectly possible to use CSV format with just
--pset=format=csv.  -C could also be useful for other things, say
compression.

Not reusing fieldsep and creating a new variable looks like the correct
way to go, as there is no need to bother about cross-option checks if
the format is moved from unaligned to csv and the other way around,
especially as the delimiter of CSV needs cannot be larger than
one-byte.

+   pset.popt.topt.fieldSepCsv = pg_strdup(",");
Let's store that in a variable instead of hardcoding it.

In the regression tests, "col 9" is wanted with a newline?

I am still digging into the details of this patch, and I am
unfortunately running out of fuel for the day.
--
Michael


signature.asc
Description: PGP signature


Re: csv format for psql

2018-11-06 Thread Daniel Verite
Michael Paquier wrote:

> Ordering them in alphabetical order is a good idea due to the high
> number of options available, and more would pile up even if this
> separates a bit "aligned" and "unaligned", so I have have separated
> those diffs from the core patch and committed it, leaving the core
> portion of the patch aside for later.

Here's a rebased version following these changes.

Best regards,
-- 
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite
diff --git a/doc/src/sgml/ref/copy.sgml b/doc/src/sgml/ref/copy.sgml
index 13a8b68..98147ef 100644
--- a/doc/src/sgml/ref/copy.sgml
+++ b/doc/src/sgml/ref/copy.sgml
@@ -672,6 +672,10 @@ COPY count
 
   
CSV Format
+   
+CSV
+in COPY
+   
 

 This format option is used for importing and exporting the Comma
diff --git a/doc/src/sgml/ref/psql-ref.sgml b/doc/src/sgml/ref/psql-ref.sgml
index a1ca940..2897486 100644
--- a/doc/src/sgml/ref/psql-ref.sgml
+++ b/doc/src/sgml/ref/psql-ref.sgml
@@ -152,6 +152,20 @@ EOF
 
 
 
+  --csv
+  
+  
+  
+   CSV
+   in psql
+  
+  Switches to CSV output mode. This is equivalent
+  to \pset format csv.
+  
+  
+
+
+
   -d dbname
   --dbname=dbname
   
@@ -2557,6 +2571,19 @@ lo_import 152801
   
 
   
+  fieldsep_csv
+  
+  
+  Specifies the field separator to be used in the
+  CSV format. When the separator appears in a field
+  value, that field is output inside double quotes according to
+  CSV rules. To set a tab as field separator, type
+  \pset fieldsep_csv '\t'. The default is a comma.
+  
+  
+  
+
+  
   fieldsep_zero
   
   
@@ -2584,11 +2611,16 @@ lo_import 152801
   format
   
   
-  Sets the output format to one of aligned,
-  asciidoc, html,
+  Sets the output format to one of
+  aligned,
+  asciidoc,
+  csv,
+  html,
   latex (uses tabular),
-  latex-longtable, troff-ms,
-  unaligned, or wrapped.
+  latex-longtable,
+  troff-ms,
+  unaligned,
+  or wrapped.
   Unique abbreviations are allowed.  (That would mean one letter
   is enough.)
   
@@ -2596,14 +2628,27 @@ lo_import 152801
   unaligned format writes all columns of a 
row on one
   line, separated by the currently active field separator. This
   is useful for creating output that might be intended to be read
-  in by other programs (for example, tab-separated or comma-separated
-  format).
+  in by other programs.
   
 
   aligned format is the standard, 
human-readable,
   nicely formatted text output;  this is the default.
   
 
+ csv format writes columns separated by
+ commas, applying the quoting rules described in RFC 4180.
+ Alternative separators can be selected with
+ \pset fieldsep_csv.
+ The output is compatible with the CSV format of the
+ COPY command. The header with column names
+ is output unless the tuples_only parameter is
+ on. Title and footers are not printed.
+ Each row is terminated by the system-dependent end-of-line character,
+ which is typically a single newline (\n) for
+ Unix-like systems or a carriage return and newline sequence
+ (\r\n) for Microsoft Windows.
+ 
+
   wrapped format is like 
aligned but wraps
   wide data values across lines to make the output fit in the target
   column width.  The target width is determined as described under
diff --git a/src/bin/psql/command.c b/src/bin/psql/command.c
index 0dea54d..ea064ab 100644
--- a/src/bin/psql/command.c
+++ b/src/bin/psql/command.c
@@ -1941,8 +1941,8 @@ exec_command_pset(PsqlScanState scan_state, bool 
active_branch)
 
int i;
static const char *const my_list[] = {
-   "border", "columns", "expanded", "fieldsep", 
"fieldsep_zero",
-   "footer", "format", "linestyle", "null",
+   "border", "columns", "expanded", "fieldsep", 
"fieldsep_csv",
+   "fieldsep_zero", "footer", "format", 
"linestyle", "null",
"numericlocale", "pager", "pager_min_lines",
"recordsep", "recordsep_zero",
"tableattr", "title", "tuples_only",
@@ -3566,6 +3566,9 @@ _align2string(enum printFormat in)
case PRINT_ASCIIDOC:
return "asciidoc";
break;
+   case PRINT_CSV:
+   return "csv";
+   

Re: csv format for psql

2018-11-05 Thread Michael Paquier
On Sat, Nov 03, 2018 at 09:45:14AM +0100, Fabien COELHO wrote:
> Patch applies cleanly, compiles, make check ok, doc gen ok.
> 
> Fine with me. I switched the patch to "Ready".

I have begun looking at this patch, and there are some noise diffs
because of the reordering of the items you are doing in psql code.
Ordering them in alphabetical order is a good idea due to the high
number of options available, and more would pile up even if this
separates a bit "aligned" and "unaligned", so I have have separated
those diffs from the core patch and committed it, leaving the core
portion of the patch aside for later.
--
Michael


signature.asc
Description: PGP signature


Re: csv format for psql

2018-11-03 Thread Fabien COELHO



Bonjour Daniel,


Here's a rebased version with a couple regression tests
added per the discussions during the previous CF.

Now at https://commitfest.postgresql.org/20/1861/


Patch applies cleanly, compiles, make check ok, doc gen ok.

Fine with me. I switched the patch to "Ready".

--
Fabien.



Re: csv format for psql

2018-11-02 Thread Daniel Verite
Michael Paquier wrote:

> If you can produce a new  version, please feel free to post it.

Here's a rebased version with a couple regression tests
added per the discussions during the previous CF.

Now at https://commitfest.postgresql.org/20/1861/

Best regards,
-- 
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite
diff --git a/doc/src/sgml/ref/copy.sgml b/doc/src/sgml/ref/copy.sgml
index 13a8b68..98147ef 100644
--- a/doc/src/sgml/ref/copy.sgml
+++ b/doc/src/sgml/ref/copy.sgml
@@ -672,6 +672,10 @@ COPY count
 
   
CSV Format
+   
+CSV
+in COPY
+   
 

 This format option is used for importing and exporting the Comma
diff --git a/doc/src/sgml/ref/psql-ref.sgml b/doc/src/sgml/ref/psql-ref.sgml
index eb9d93a..7617c5e 100644
--- a/doc/src/sgml/ref/psql-ref.sgml
+++ b/doc/src/sgml/ref/psql-ref.sgml
@@ -152,6 +152,20 @@ EOF
 
 
 
+  --csv
+  
+  
+  
+   CSV
+   in psql
+  
+  Switches to CSV output mode. This is equivalent
+  to \pset format csv.
+  
+  
+
+
+
   -d dbname
   --dbname=dbname
   
@@ -2557,6 +2571,19 @@ lo_import 152801
   
 
   
+  fieldsep_csv
+  
+  
+  Specifies the field separator to be used in the
+  CSV format. When the separator appears in a field
+  value, that field is output inside double quotes according to
+  CSV rules. To set a tab as field separator, type
+  \pset fieldsep_csv '\t'. The default is a comma.
+  
+  
+  
+
+  
   fieldsep_zero
   
   
@@ -2584,12 +2611,11 @@ lo_import 152801
   format
   
   
-  Sets the output format to one of unaligned,
-  aligned, wrapped,
-  html, asciidoc,
+  Sets the output format to one of aligned,
+  asciidoc, csv, 
html,
   latex (uses tabular),
-  latex-longtable, or
-  troff-ms.
+  latex-longtable, troff-ms,
+  unaligned, or wrapped.
   Unique abbreviations are allowed.  (That would mean one letter
   is enough.)
   
@@ -2597,14 +2623,27 @@ lo_import 152801
   unaligned format writes all columns of a 
row on one
   line, separated by the currently active field separator. This
   is useful for creating output that might be intended to be read
-  in by other programs (for example, tab-separated or comma-separated
-  format).
+  in by other programs.
   
 
   aligned format is the standard, 
human-readable,
   nicely formatted text output;  this is the default.
   
 
+ csv format writes columns separated by
+ commas, applying the quoting rules described in RFC 4180.
+ Alternative separators can be selected with
+ \pset fieldsep_csv.
+ The output is compatible with the CSV format of the
+ COPY command. The header with column names
+ is output unless the tuples_only parameter is
+ on. Title and footers are not printed.
+ Each row is terminated by the system-dependent end-of-line character,
+ which is typically a single newline (\n) for
+ Unix-like systems or a carriage return and newline sequence
+ (\r\n) for Microsoft Windows.
+ 
+
   wrapped format is like 
aligned but wraps
   wide data values across lines to make the output fit in the target
   column width.  The target width is determined as described under
diff --git a/src/bin/psql/command.c b/src/bin/psql/command.c
index 5b4d54a..8d0ad71 100644
--- a/src/bin/psql/command.c
+++ b/src/bin/psql/command.c
@@ -1941,8 +1941,8 @@ exec_command_pset(PsqlScanState scan_state, bool 
active_branch)
 
int i;
static const char *const my_list[] = {
-   "border", "columns", "expanded", "fieldsep", 
"fieldsep_zero",
-   "footer", "format", "linestyle", "null",
+   "border", "columns", "expanded", "fieldsep", 
"fieldsep_csv",
+   "fieldsep_zero", "footer", "format", 
"linestyle", "null",
"numericlocale", "pager", "pager_min_lines",
"recordsep", "recordsep_zero",
"tableattr", "title", "tuples_only",
@@ -3584,6 +3584,9 @@ _align2string(enum printFormat in)
case PRINT_TROFF_MS:
return "troff-ms";
break;
+   case PRINT_CSV:
+   return "csv";
+   break;
}
return "unknown";
 }
@@ -3639,25 +3642,27 @@ do_pset(const char *param, const char *value, 
printQueryOpt *popt, bool quiet)
{
if 

Re: csv format for psql

2018-09-30 Thread Michael Paquier
On Tue, Aug 28, 2018 at 08:54:02PM +0200, Fabien COELHO wrote:
> Patch applies cleanly, compiles, "make check" ok.
> 
> I tried "\pset fieldsep_csv '\0'" which could be seen as one character, but
> it does not want it. I'm okay with this behavior.
> 
> I'd suggest to add a test about rejected fieldsep_csv values, which raises
> both errors.

The patch has been waiting for input from the author since August, so I
am marking it as returned with feedback.  If you can produce a new
version, please feel free to post it.
--
Michael


signature.asc
Description: PGP signature


Re: csv format for psql

2018-08-28 Thread Fabien COELHO



Bonjour Daniel,


Currently popt->topt.fieldSepCsv can't be NULL so I've simplified this
to just return pset_quoted_string(popt->topt.fieldSepCsv).


Ok.


While testing I found a small issue if "fieldsep_csv" is set to a strange
value:

   \pset format_csv ',,'


You meant \pset fieldsep_csv ',,'


Indeed.


If you do that even SELECT 'foo', 'bar' comes out wrong because it looks
like a 3-field row: foo,,bar


Yes and no. I asked for ",," as a separator, so probably I really want 
that and I'd be okay with the result.



If we want to prevent people to shoot themselves in the foot with that
sort of thing, I've added a couple tests: No double quote, no LF or
CR, single character (but multibyte allowed) for the separator.


Ok, why not.

Patch applies cleanly, compiles, "make check" ok.

I tried "\pset fieldsep_csv '\0'" which could be seen as one character, 
but it does not want it. I'm okay with this behavior.


I'd suggest to add a test about rejected fieldsep_csv values, which raises 
both errors.


--
Fabien.



Re: csv format for psql

2018-08-28 Thread Daniel Verite
Fabien COELHO wrote:

> My point was more about the documentation which should be clear about what 
> is the EOL. I understand from your point above that the EOL is the 
> platform-specific one, which is yet again fine with me, but should be said 
> clearly in the documentation?

Okay, I've added a bit in the doc.

>   + else if (strcmp(param, "fieldsep_csv") == 0)
>   + return pset_quoted_string(popt->topt.fieldSepCsv
>   +   ? popt->topt.fieldSepCsv
>   +   : "");
> 
> It is unclear to me when this "" is triggered. Never? If so, maybe a 
> comment should say so?

Currently popt->topt.fieldSepCsv can't be NULL so I've simplified this
to just return pset_quoted_string(popt->topt.fieldSepCsv).

> Why removing "-C"? As we already have "-A" an "-H", I was fine with it.

It was a leftover from v3. Participants in the thread don't seem to
want the short option, to my surprise. Pavel argued first against -C
upthread, I argued quite a bit in favor of it, the "for" had 0 upvote, and
"against" had at least 4 I think, including yours in [1].


> It seems that you changed the indentation in "psql-ref.sgml":
> 
>
>   -
>   + 

Fixed.

>   -  {"unaligned", "aligned", "wrapped", "html", "asciidoc",
>   -   "latex", "latex-longtable", "troff-ms", NULL};
>   +  {"aligned", "asciidoc", "csv", "html", "latex", "latex-longtable",
>   +   "unaligned", "troff-ms", "wrapped", NULL};
> 
> If you want alphabetical, 'u' > 't'.

Fixed.

> 
> While testing I found a small issue if "fieldsep_csv" is set to a strange 
> value:
> 
>\pset format_csv ',,'
>SELECT ',', ',';
>-- gives the ambiguous:
>
> 
> The rule to decide whether to quote should be made safer/smarter. I'd 
> suggest that if the string contains any of the caracters used in format 
> csv it should be quoted.

You meant \pset fieldsep_csv ',,'
If you do that even SELECT 'foo', 'bar' comes out wrong because it looks
like a 3-field row: foo,,bar
If we want to prevent people to shoot themselves in the foot with that
sort of thing, I've added a couple tests: No double quote, no LF or
CR, single character (but multibyte allowed) for the separator.


[1]
https://www.postgresql.org/message-id/alpine.DEB.2.20.1803081004241.2916%40lancre


Best regards,
-- 
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite



Best regards,
-- 
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite
diff --git a/doc/src/sgml/ref/copy.sgml b/doc/src/sgml/ref/copy.sgml
index 13a8b68..98147ef 100644
--- a/doc/src/sgml/ref/copy.sgml
+++ b/doc/src/sgml/ref/copy.sgml
@@ -672,6 +672,10 @@ COPY count
 
   
CSV Format
+   
+CSV
+in COPY
+   
 

 This format option is used for importing and exporting the Comma
diff --git a/doc/src/sgml/ref/psql-ref.sgml b/doc/src/sgml/ref/psql-ref.sgml
index eb9d93a..7617c5e 100644
--- a/doc/src/sgml/ref/psql-ref.sgml
+++ b/doc/src/sgml/ref/psql-ref.sgml
@@ -152,6 +152,20 @@ EOF
 
 
 
+  --csv
+  
+  
+  
+   CSV
+   in psql
+  
+  Switches to CSV output mode. This is equivalent
+  to \pset format csv.
+  
+  
+
+
+
   -d dbname
   --dbname=dbname
   
@@ -2557,6 +2571,19 @@ lo_import 152801
   
 
   
+  fieldsep_csv
+  
+  
+  Specifies the field separator to be used in the
+  CSV format. When the separator appears in a field
+  value, that field is output inside double quotes according to
+  CSV rules. To set a tab as field separator, type
+  \pset fieldsep_csv '\t'. The default is a comma.
+  
+  
+  
+
+  
   fieldsep_zero
   
   
@@ -2584,12 +2611,11 @@ lo_import 152801
   format
   
   
-  Sets the output format to one of unaligned,
-  aligned, wrapped,
-  html, asciidoc,
+  Sets the output format to one of aligned,
+  asciidoc, csv, 
html,
   latex (uses tabular),
-  latex-longtable, or
-  troff-ms.
+  latex-longtable, troff-ms,
+  unaligned, or wrapped.
   Unique abbreviations are allowed.  (That would mean one letter
   is enough.)
   
@@ -2597,14 +2623,27 @@ lo_import 152801
   unaligned format writes all columns of a 
row on one
   line, separated by the currently active field separator. This
   is useful for creating output that might be intended to be read
-  in by other programs (for example, tab-separated or comma-separated
-  format).
+  in by other programs.
   
 
   aligned format is the standard, 
human-readable,
   nicely formatted text output;  this is the default.
   
 
+ csv format writes columns separated by
+ commas, applying the 

Re: csv format for psql

2018-08-23 Thread Fabien COELHO



Bonjour Daniel,


'\n' gets translated by libc when the output is in text mode.
We discussed this upthread, but maybe it should be a code comment:
added now.


My point was more about the documentation which should be clear about what 
is the EOL. I understand from your point above that the EOL is the 
platform-specific one, which is yet again fine with me, but should be said 
clearly in the documentation?


There seems to be a test in the code to set an empty string "" by 
default, but it is unclear to me when this is triggered.


Where is that code?


 + else if (strcmp(param, "fieldsep_csv") == 0)
 + return pset_quoted_string(popt->topt.fieldSepCsv
 +   ? popt->topt.fieldSepCsv
 +   : "");

It is unclear to me when this "" is triggered. Never? If so, maybe a 
comment should say so?


A few comments about v6: Patch applies cleanly, make check ok, doc build 
ok.


Why removing "-C"? As we already have "-A" an "-H", I was fine with it.

It seems that you changed the indentation in "psql-ref.sgml":

  
 -
 + 

but maybe this is just the previous entry which was misaligned...

 -  {"unaligned", "aligned", "wrapped", "html", "asciidoc",
 -   "latex", "latex-longtable", "troff-ms", NULL};
 +  {"aligned", "asciidoc", "csv", "html", "latex", "latex-longtable",
 +   "unaligned", "troff-ms", "wrapped", NULL};

If you want alphabetical, 'u' > 't'.

While testing I found a small issue if "fieldsep_csv" is set to a strange 
value:


  \pset format_csv ',,'
  SELECT ',', ',';
  -- gives the ambiguous:
  

The rule to decide whether to quote should be made safer/smarter. I'd 
suggest that if the string contains any of the caracters used in format 
csv it should be quoted.


--
Fabien.



Re: csv format for psql

2018-08-22 Thread Daniel Verite
Fabien COELHO wrote:

> Doc: "according to the csv rules" -> "according to csv rules."?

Fixed.

> Doc: "RFC-4180" -> "RFC 4180"?

Fixed. The other references to RFCs use this syntax indeed.

> The previous RFC specifies CRLF as eol, but '\n' is hardcoded in the 
> source. I'm fine with that, but I'd suggest that the documentation should 
> said which EOL is used.

'\n' gets translated by libc when the output is in text mode.
We discussed this upthread, but maybe it should be a code comment:
added now.

> ISTM that "--csv" & "-C" are not documented, neither in sgml nor under
> --help.
> 
> "fieldsep_csv" does not show on the list of output options under "\?".

Oops, fixed.

> There seems to be a test in the code to set an empty string "" by default,
> but it is unclear to me when this is triggered.

Where is that code?

> I'd tend to use "CSV" instead of "csv" everywhere it makes sense, eg in 
> the doc (CSV rules) and in variable names in the code (FooCsv -> FooCSV?), 
> but that is pretty debatable.

I've changed to upper case in a couple places and added  tags,
but depending on the context sometimes lower case feels more consistent.
This is the same as, for instance, ASCII. We display \pset linestyle as
ascii, not ASCII, presumably because everything else in the \pset area
is lower case. But both cases are accepted in input.

Also added a CSV entry in the doc index per Alvaro's suggestion in
https://www.postgresql.org/message-id/20180809202125.r4mdx2jzm7hytetz@alvherre.pgsql
with pointers to psql and COPY.


Best regards,
-- 
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite
diff --git a/doc/src/sgml/ref/copy.sgml b/doc/src/sgml/ref/copy.sgml
index 13a8b68..98147ef 100644
--- a/doc/src/sgml/ref/copy.sgml
+++ b/doc/src/sgml/ref/copy.sgml
@@ -672,6 +672,10 @@ COPY count
 
   
CSV Format
+   
+CSV
+in COPY
+   
 

 This format option is used for importing and exporting the Comma
diff --git a/doc/src/sgml/ref/psql-ref.sgml b/doc/src/sgml/ref/psql-ref.sgml
index eb9d93a..80cb843 100644
--- a/doc/src/sgml/ref/psql-ref.sgml
+++ b/doc/src/sgml/ref/psql-ref.sgml
@@ -151,7 +151,21 @@ EOF
   
 
 
-
+ 
+  --csv
+  
+  
+  
+   CSV
+   in psql
+  
+  Switches to CSV output mode. This is equivalent
+  to \pset format csv.
+  
+  
+ 
+
+ 
   -d dbname
   --dbname=dbname
   
@@ -2557,6 +2571,19 @@ lo_import 152801
   
 
   
+  fieldsep_csv
+  
+  
+  Specifies the field separator to be used in the
+  CSV format. When the separator appears in a field
+  value, that field is output inside double quotes according to
+  CSV rules. To set a tab as field separator, type
+  \pset fieldsep_csv '\t'. The default is a comma.
+  
+  
+  
+
+  
   fieldsep_zero
   
   
@@ -2584,12 +2611,11 @@ lo_import 152801
   format
   
   
-  Sets the output format to one of unaligned,
-  aligned, wrapped,
-  html, asciidoc,
+  Sets the output format to one of aligned,
+  asciidoc, csv, 
html,
   latex (uses tabular),
-  latex-longtable, or
-  troff-ms.
+  latex-longtable, troff-ms,
+  unaligned, or wrapped.
   Unique abbreviations are allowed.  (That would mean one letter
   is enough.)
   
@@ -2597,14 +2623,23 @@ lo_import 152801
   unaligned format writes all columns of a 
row on one
   line, separated by the currently active field separator. This
   is useful for creating output that might be intended to be read
-  in by other programs (for example, tab-separated or comma-separated
-  format).
+  in by other programs.
   
 
   aligned format is the standard, 
human-readable,
   nicely formatted text output;  this is the default.
   
 
+ csv format writes columns separated by
+ commas, applying the quoting rules described in RFC 4180.
+ Alternative separators can be selected with
+ \pset fieldsep_csv.
+ The output is compatible with the CSV format of the
+ COPY command. The header with column names
+ is output unless the tuples_only parameter is
+ on. Title and footers are not printed.
+ 
+
   wrapped format is like 
aligned but wraps
   wide data values across lines to make the output fit in the target
   column width.  The target width is determined as described under
diff --git a/src/bin/psql/command.c b/src/bin/psql/command.c
index 5b4d54a..3b47b38 100644
--- a/src/bin/psql/command.c
+++ b/src/bin/psql/command.c
@@ -1941,8 +1941,8 @@ exec_command_pset(PsqlScanState scan_state, bool 
active_branch)
 
int i;

Re: csv format for psql

2018-08-11 Thread Pavel Stehule
2018-08-10 12:25 GMT+02:00 Daniel Verite :

> Pavel Stehule wrote:
>
> > > On the whole I'm inclined to resubmit the patch with
> > > fieldsep_csv and some minor changes based on the rest
> > > of the discussion.
> > >
> >
> > +1
>
> PFA an updated version.
> Usage from the command line:
> $ psql --csv  # or -P format=csv
> $ psql --csv -P fieldsep_csv=";"  # for non-comma csv separator
>
> From inside psql:
>
> \pset format csv
> \pset fieldsep_csv '\t'
>
>
quick check +1 - I have not a objections

Regards

Pavel


>
> Best regards,
> --
> Daniel Vérité
> PostgreSQL-powered mailer: http://www.manitou-mail.org
> Twitter: @DanielVerite
>


Re: csv format for psql

2018-08-10 Thread Fabien COELHO



Hello Daniel,


PFA an updated version.
Usage from the command line:
$ psql --csv  # or -P format=csv
$ psql --csv -P fieldsep_csv=";"  # for non-comma csv separator

From inside psql:

\pset format csv
\pset fieldsep_csv '\t'


Patch applies cleanly, compiles, global make check ok.

Doc: "according to the csv rules" -> "according to csv rules."?

Doc: "RFC-4180" -> "RFC 4180"?

The previous RFC specifies CRLF as eol, but '\n' is hardcoded in the 
source. I'm fine with that, but I'd suggest that the documentation should 
said which EOL is used.


ISTM that "--csv" & "-C" are not documented, neither in sgml nor under
--help.

"fieldsep_csv" does not show on the list of output options under "\?".

There seems to be a test in the code to set an empty string "" by default,
but it is unclear to me when this is triggered.

I'd tend to use "CSV" instead of "csv" everywhere it makes sense, eg in 
the doc (CSV rules) and in variable names in the code (FooCsv -> FooCSV?), 
but that is pretty debatable.


--
Fabien.



Re: csv format for psql

2018-08-10 Thread Daniel Verite
Pavel Stehule wrote:

> > On the whole I'm inclined to resubmit the patch with
> > fieldsep_csv and some minor changes based on the rest
> > of the discussion.
> >
> 
> +1

PFA an updated version.
Usage from the command line:
$ psql --csv  # or -P format=csv
$ psql --csv -P fieldsep_csv=";"  # for non-comma csv separator

From inside psql:

\pset format csv
\pset fieldsep_csv '\t'


Best regards,
-- 
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite
diff --git a/doc/src/sgml/ref/psql-ref.sgml b/doc/src/sgml/ref/psql-ref.sgml
index eb9d93a..c4fd958 100644
--- a/doc/src/sgml/ref/psql-ref.sgml
+++ b/doc/src/sgml/ref/psql-ref.sgml
@@ -2557,6 +2557,19 @@ lo_import 152801
   
 
   
+  fieldsep_csv
+  
+  
+  Specifies the field separator to be used in the csv format.
+  When the separator appears in a field value, that field
+  is output inside double quotes according to the csv rules.
+  To set a tab as field separator, type \pset fieldsep_csv
+  '\t'. The default is a comma.
+  
+  
+  
+
+  
   fieldsep_zero
   
   
@@ -2585,7 +2598,7 @@ lo_import 152801
   
   
   Sets the output format to one of unaligned,
-  aligned, wrapped,
+  aligned, csv, 
wrapped,
   html, asciidoc,
   latex (uses tabular),
   latex-longtable, or
@@ -2597,14 +2610,22 @@ lo_import 152801
   unaligned format writes all columns of a 
row on one
   line, separated by the currently active field separator. This
   is useful for creating output that might be intended to be read
-  in by other programs (for example, tab-separated or comma-separated
-  format).
+  in by other programs.
   
 
   aligned format is the standard, 
human-readable,
   nicely formatted text output;  this is the default.
   
 
+ csv format writes columns separated by
+ commas, applying the quoting rules described in RFC-4180.
+ Alternative separators can be selected with \pset 
fieldsep_csv.
+ The output is compatible with the CSV format of the 
COPY
+ command. The header with column names is output unless the
+ tuples_only parameter is on.
+ Title and footers are not printed.
+ 
+
   wrapped format is like 
aligned but wraps
   wide data values across lines to make the output fit in the target
   column width.  The target width is determined as described under
diff --git a/src/bin/psql/command.c b/src/bin/psql/command.c
index 5b4d54a..77ed105 100644
--- a/src/bin/psql/command.c
+++ b/src/bin/psql/command.c
@@ -1941,8 +1941,8 @@ exec_command_pset(PsqlScanState scan_state, bool 
active_branch)
 
int i;
static const char *const my_list[] = {
-   "border", "columns", "expanded", "fieldsep", 
"fieldsep_zero",
-   "footer", "format", "linestyle", "null",
+   "border", "columns", "expanded", "fieldsep", 
"fieldsep_csv",
+   "fieldsep_zero", "footer", "format", 
"linestyle", "null",
"numericlocale", "pager", "pager_min_lines",
"recordsep", "recordsep_zero",
"tableattr", "title", "tuples_only",
@@ -3584,6 +3584,9 @@ _align2string(enum printFormat in)
case PRINT_TROFF_MS:
return "troff-ms";
break;
+   case PRINT_CSV:
+   return "csv";
+   break;
}
return "unknown";
 }
@@ -3639,25 +3642,27 @@ do_pset(const char *param, const char *value, 
printQueryOpt *popt, bool quiet)
{
if (!value)
;
-   else if (pg_strncasecmp("unaligned", value, vallen) == 0)
-   popt->topt.format = PRINT_UNALIGNED;
else if (pg_strncasecmp("aligned", value, vallen) == 0)
popt->topt.format = PRINT_ALIGNED;
-   else if (pg_strncasecmp("wrapped", value, vallen) == 0)
-   popt->topt.format = PRINT_WRAPPED;
-   else if (pg_strncasecmp("html", value, vallen) == 0)
-   popt->topt.format = PRINT_HTML;
else if (pg_

Re: csv format for psql

2018-06-03 Thread Pavel Stehule
2018-06-03 19:16 GMT+02:00 David G. Johnston :

> On Sunday, June 3, 2018, Pavel Stehule  wrote:
>>
>>
>> \pset fieldsep ;
>> \pset format csv
>>
>> I don't like when one command overwrite settings of some other command.
>> We can introduce some similar like GUC where default values from configure
>> file can be overwritten by custom setting for session. So I am able to
>> thinking about some settings
>>
>> like
>>
>> \pset csv_default_separator
>> \pset csv_default_header
>>
>> Then there is question to simplify all and use \pset csv_separator, and
>> csv format independent of fieldseparator value? It is possible, but I don't
>> think so have more option for similar value is good idea (for interactive
>> mode).
>>
>
> Having a dedicated option seems infinitely better than adding new settings
> for defaults and having to keep track of whether the shared field separator
> is a default versus a user specified value.
>
> Recently we reworked server GUCs to avoid this kind of unset/default
> behavior.  I don't see how introducing or relying upon it in psql would be
> an advantage.
>

I am thinking so psql design is little bit special, because we should to
think about more modes - interactive and not interactive, and our goal
should be some consistency from user perspective in interactive mode. But
for me, the special CSV options are acceptable - although little bit
unfriendly from user perspective.


> At this point -1, keep the status quo, for any implementation that tries
> to make the unaligned mode field separator perform double duty.  I'm open,
> but unlikely, to be convinced that it can be done without unforeseen bad
> side effects and degraded usability.
>

With respect to your opinion, I don't agree so current status is good -
mainly in this case.

Regards

Pavel



> David J.
>


Re: csv format for psql

2018-06-03 Thread David G. Johnston
On Sunday, June 3, 2018, Pavel Stehule  wrote:
>
>
> \pset fieldsep ;
> \pset format csv
>
> I don't like when one command overwrite settings of some other command. We
> can introduce some similar like GUC where default values from configure
> file can be overwritten by custom setting for session. So I am able to
> thinking about some settings
>
> like
>
> \pset csv_default_separator
> \pset csv_default_header
>
> Then there is question to simplify all and use \pset csv_separator, and
> csv format independent of fieldseparator value? It is possible, but I don't
> think so have more option for similar value is good idea (for interactive
> mode).
>

Having a dedicated option seems infinitely better than adding new settings
for defaults and having to keep track of whether the shared field separator
is a default versus a user specified value.

Recently we reworked server GUCs to avoid this kind of unset/default
behavior.  I don't see how introducing or relying upon it in psql would be
an advantage.

At this point -1, keep the status quo, for any implementation that tries to
make the unaligned mode field separator perform double duty.  I'm open, but
unlikely, to be convinced that it can be done without unforeseen bad side
effects and degraded usability.

David J.


Re: csv format for psql

2018-06-03 Thread Pavel Stehule
Hi

2018-04-07 14:23 GMT+02:00 Daniel Verite :

> Peter Eisentraut wrote:
>
> > Another thought: Isn't CSV just the same as unaligned output plus some
> > quoting?  Could we add a quote character setting and then define --csv
> > to be quote-character = " and fieldsep = , ?
>
> Plus footer set to off. So --csv would be like
> \pset format unaligned
> \pset fieldsep ','
> \pset footer off
> \pset unaligned_quote_character '"'
>
> I guess we'd add a \csv command that does these together.
>
> There would still be some discomfort with some of the issues
> discussed upthread. For instance
>   psql -F ';' --csv
> is likely to reset fieldsep to ',' having then a different outcome than
>   psql --csv -F';'
>
> And there's the point on how to come back from \csv to another
> output, given that it has overwritten "footer" that is used across
> all formats, and fieldsep used by unaligned.
> So a user might need to figure out anyway the
> intricaties behind \csv, if it's not an independant format.
>
> On the whole I'm inclined to resubmit the patch with
> fieldsep_csv and some minor changes based on the rest
> of the discussion.
>
>
Can we restart discussion about this feature? I am sure so there is strong
common agreement about usability of this feature. Now, we have good enough
time to design and implement it well.

I am starting this discussion with some initial requirement and we can talk
about validity of these requirements first.

1. command line long option --csv set output format CSV, default separator
should be ','
2. option -F should to work too (there is (was) first issue - this value is
now semicolon). I don't like introduction a new like "-F" option just for
CSV, because we can share this option for all output formats where
separator can be customized.
3. The behave should not depend on order of usage
4. I miss some short/long option for enabling/disabling header (default
should be on), there is option -t, that is inverse value of "use header". I
am not sure if it is enough.
5. In interactive mode I prefer activating CSV format via \pset format csv
.. like almost all any other supported formats

It should to set output configuration like mentioned

\pset format unaligned
\pset fieldsep ',' -- if not assigned before
\pset footer off
\pset unaligned_quote_character '"'

similarly like command line I am expecting so I can set \pset fieldsep and
\pset format in any order.

expected behave:

psql -c "select ..." --csv -F';'
psql -c "select ..." -F';' --csv

interactive

\pset format csv
\pset fieldsep ;

or

\pset fieldsep ;
\pset format csv

I don't like when one command overwrite settings of some other command. We
can introduce some similar like GUC where default values from configure
file can be overwritten by custom setting for session. So I am able to
thinking about some settings

like

\pset csv_default_separator
\pset csv_default_header

Then there is question to simplify all and use \pset csv_separator, and csv
format independent of fieldseparator value? It is possible, but I don't
think so have more option for similar value is good idea (for interactive
mode).

Why I propose these rich and strong requirements?

1. CSV is fundamental format - I don't know any other with similar value
for data import/export
2. unfortunately, important feature (separator) is not same in all areas -
in Czech rep most used separator is semicolon - so typical case needs
setting format and setting separator.

Regards

Pavel


Re: csv format for psql

2018-04-08 Thread Pavel Stehule
2018-04-07 14:23 GMT+02:00 Daniel Verite :

> Peter Eisentraut wrote:
>
> > Another thought: Isn't CSV just the same as unaligned output plus some
> > quoting?  Could we add a quote character setting and then define --csv
> > to be quote-character = " and fieldsep = , ?
>
> Plus footer set to off. So --csv would be like
> \pset format unaligned
> \pset fieldsep ','
> \pset footer off
> \pset unaligned_quote_character '"'
>
> I guess we'd add a \csv command that does these together.
>
> There would still be some discomfort with some of the issues
> discussed upthread. For instance
>   psql -F ';' --csv
> is likely to reset fieldsep to ',' having then a different outcome than
>   psql --csv -F';'
>
> And there's the point on how to come back from \csv to another
> output, given that it has overwritten "footer" that is used across
> all formats, and fieldsep used by unaligned.
> So a user might need to figure out anyway the
> intricaties behind \csv, if it's not an independant format.
>
> On the whole I'm inclined to resubmit the patch with
> fieldsep_csv and some minor changes based on the rest
> of the discussion.
>

+1


>
> Best regards,
> --
> Daniel Vérité
> PostgreSQL-powered mailer: http://www.manitou-mail.org
> Twitter: @DanielVerite
>


Re: csv format for psql

2018-04-07 Thread Daniel Verite
Daniel Verite wrote:

> The output might still differ compared to COPY in that line endings
> depend on the client-side OS. There's also the minor issue
> of a single \ by itself on a line, which gets quoted by COPY
> and not by psql.

I meant \. or backslash followed by period.
This sequence means nothing in particular in csv outside
of COPY, but it means end-of-data for COPY.

Best regards,
-- 
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite



Re: csv format for psql

2018-04-07 Thread Daniel Verite
Isaac Morland wrote:

> OK, mostly trying to avoid commenting because I doubt I have much to add.
> But. If I ask for CSV and don't specify any overrides, I expect to get
> "C"omma separated values, not some other character. More specifically, if I
> say --csv I expect to get files that are identical with what I would get if
> I used COPY ... CSV. Actually, COPY ... CSV HEADER, given that psql shows
> column headings. This also implies that I expect the quoting and related
> details that are associated with CSV.

Yes, this is what the current patch does. Only the first version
used the same default delimiter as the unaligned mode, but
nobody liked that part.
As for the header, it's driven by how it's already done for other
formats in psql, with the tuples_only setting, which implies 
that column names are output by default as you mention.
Both settings must be changeable, the exact details of how/when
is what's being discussed.

The output might still differ compared to COPY in that line endings
depend on the client-side OS. There's also the minor issue
of a single \ by itself on a line, which gets quoted by COPY
and not by psql.


Best regards,
-- 
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite



Re: csv format for psql

2018-04-07 Thread Daniel Verite
Peter Eisentraut wrote:

> Another thought: Isn't CSV just the same as unaligned output plus some
> quoting?  Could we add a quote character setting and then define --csv
> to be quote-character = " and fieldsep = , ?

Plus footer set to off. So --csv would be like
\pset format unaligned
\pset fieldsep ','
\pset footer off
\pset unaligned_quote_character '"'

I guess we'd add a \csv command that does these together.

There would still be some discomfort with some of the issues
discussed upthread. For instance
  psql -F ';' --csv
is likely to reset fieldsep to ',' having then a different outcome than
  psql --csv -F';'

And there's the point on how to come back from \csv to another
output, given that it has overwritten "footer" that is used across
all formats, and fieldsep used by unaligned.
So a user might need to figure out anyway the
intricaties behind \csv, if it's not an independant format.

On the whole I'm inclined to resubmit the patch with
fieldsep_csv and some minor changes based on the rest
of the discussion.

Best regards,
-- 
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite



Re: csv format for psql

2018-04-07 Thread Pavel Stehule
2018-04-07 13:55 GMT+02:00 Daniel Verite :

> Pavel Stehule wrote:
>
> > so we can to have
> >
> > \pset format xxx
> >
> > and set of local possibly changed defaults
> >
> > \pset csv_fieldsep ,
> > \pset csv_tuplesonly on
> > \pset unaligned_fieldsep |
> > \pset unaligned_tuplesonly off
>
> tuples_only (\t) is a single setting that is currently used by all
> formats. It makes sense as it is and I don't quite see what we
> would gain by "exploding" it.
>

It was a example, how the one default can be not good enough.

Usually we expect in align, unalign headers by default. But somebody can
expect tuples only by default for CSV format.


> There's also "footer" that goes in tandem with "tuples_only",
> to switch off the footer while keeping the header and title.
> Whatever change is considered to "tuples_only", "footer" must
> be considered with it.
> For the csv format, tuples_only=off is interpreted as "output the header"
> and tuples_only=on as "don't output the header". This is consistent
> with what other formats do.
>

My note was not about the implementations, it was about different
expectations of some users - looks on Isaac's mail, pls.



>
>
> Best regards,
> --
> Daniel Vérité
> PostgreSQL-powered mailer: http://www.manitou-mail.org
> Twitter: @DanielVerite
>


Re: csv format for psql

2018-04-07 Thread Daniel Verite
Pavel Stehule wrote:

> so we can to have
> 
> \pset format xxx
> 
> and set of local possibly changed defaults
> 
> \pset csv_fieldsep ,
> \pset csv_tuplesonly on
> \pset unaligned_fieldsep |
> \pset unaligned_tuplesonly off

tuples_only (\t) is a single setting that is currently used by all
formats. It makes sense as it is and I don't quite see what we
would gain by "exploding" it.
There's also "footer" that goes in tandem with "tuples_only",
to switch off the footer while keeping the header and title.
Whatever change is considered to "tuples_only", "footer" must
be considered with it.
For the csv format, tuples_only=off is interpreted as "output the header"
and tuples_only=on as "don't output the header". This is consistent
with what other formats do.


Best regards,
-- 
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite



Re: csv format for psql

2018-04-06 Thread Jonathan S. Katz
Hi Pavel,

> On Apr 6, 2018, at 1:38 AM, Pavel Stehule  wrote:
> 
> 
> 
> 2018-04-06 5:46 GMT+02:00 Jonathan S. Katz  >:
> 
> > On Apr 5, 2018, at 11:08 PM, Peter Eisentraut 
> >  > > wrote:
> >
> > On 4/1/18 03:27, Pavel Stehule wrote:
> >> I don't share option so CSV format should be exactly same like CSV COPY.
> >> COPY is designed for backups - and header is not too important there.
> >> When I seen some csv, then there usually header was used.
> >
> > I think in practice a lot of people use COPY also because it's a nice
> > way to get CSV output, even if it's not for backups.  The options that
> > COPY has for CSV are clearly designed around making the output
> > compatible with various CSV-variants.
> 
> +1
> 
> From a user standpoint this was mostly how I use COPY.  Someone
> requests a report that they can manipulate in $SPREADSHEET.  I write
> a query, place it inside a “COPY” statement with FORMAT CSV,
> HEADER TRUE, save to file, and deliver.
> 
> > Another thought: Isn't CSV just the same as unaligned output plus some
> > quoting?  Could we add a quote character setting and then define --csv
> > to be quote-character = " and fieldsep = , ?
> 
> This makes a lot of sense. I’ve also generated CSV files using a
> combination of:
> 
> \f ,
> \a
> \o file.csv
> 
> and then running the query, but if any of the fields contained a “,” if would
> inevitably break in $SPREADSHEET.
> 
> unfortunately, most used CSV separator in Czech Repuplic is ; due MS Excel 
> settings for CR

Sure.  So in that case with the above, I would have used "\f ;” or
DELIMITER ‘;’ so it sounds like Peter’s suggestion is still valid.

Jonathan



Re: csv format for psql

2018-04-05 Thread Pavel Stehule
2018-04-06 5:46 GMT+02:00 Jonathan S. Katz :

>
> > On Apr 5, 2018, at 11:08 PM, Peter Eisentraut <
> peter.eisentr...@2ndquadrant.com> wrote:
> >
> > On 4/1/18 03:27, Pavel Stehule wrote:
> >> I don't share option so CSV format should be exactly same like CSV COPY.
> >> COPY is designed for backups - and header is not too important there.
> >> When I seen some csv, then there usually header was used.
> >
> > I think in practice a lot of people use COPY also because it's a nice
> > way to get CSV output, even if it's not for backups.  The options that
> > COPY has for CSV are clearly designed around making the output
> > compatible with various CSV-variants.
>
> +1
>
> From a user standpoint this was mostly how I use COPY.  Someone
> requests a report that they can manipulate in $SPREADSHEET.  I write
> a query, place it inside a “COPY” statement with FORMAT CSV,
> HEADER TRUE, save to file, and deliver.
>
> > Another thought: Isn't CSV just the same as unaligned output plus some
> > quoting?  Could we add a quote character setting and then define --csv
> > to be quote-character = " and fieldsep = , ?
>
> This makes a lot of sense. I’ve also generated CSV files using a
> combination of:
>
> \f ,
> \a
> \o file.csv
>
> and then running the query, but if any of the fields contained a “,” if
> would
> inevitably break in $SPREADSHEET.
>

unfortunately, most used CSV separator in Czech Repuplic is ; due MS Excel
settings for CR

Regards

Pavel

>
> Jonathan
>
>


Re: csv format for psql

2018-04-05 Thread Jonathan S. Katz

> On Apr 5, 2018, at 11:08 PM, Peter Eisentraut 
>  wrote:
> 
> On 4/1/18 03:27, Pavel Stehule wrote:
>> I don't share option so CSV format should be exactly same like CSV COPY.
>> COPY is designed for backups - and header is not too important there.
>> When I seen some csv, then there usually header was used.
> 
> I think in practice a lot of people use COPY also because it's a nice
> way to get CSV output, even if it's not for backups.  The options that
> COPY has for CSV are clearly designed around making the output
> compatible with various CSV-variants.

+1

From a user standpoint this was mostly how I use COPY.  Someone
requests a report that they can manipulate in $SPREADSHEET.  I write
a query, place it inside a “COPY” statement with FORMAT CSV,
HEADER TRUE, save to file, and deliver.

> Another thought: Isn't CSV just the same as unaligned output plus some
> quoting?  Could we add a quote character setting and then define --csv
> to be quote-character = " and fieldsep = , ?

This makes a lot of sense. I’ve also generated CSV files using a
combination of:

\f ,
\a
\o file.csv

and then running the query, but if any of the fields contained a “,” if would
inevitably break in $SPREADSHEET.

Jonathan




Re: csv format for psql

2018-04-05 Thread Peter Eisentraut
On 4/1/18 03:27, Pavel Stehule wrote:
> I don't share option so CSV format should be exactly same like CSV COPY.
> COPY is designed for backups - and header is not too important there.
> When I seen some csv, then there usually header was used.

I think in practice a lot of people use COPY also because it's a nice
way to get CSV output, even if it's not for backups.  The options that
COPY has for CSV are clearly designed around making the output
compatible with various CSV-variants.  If we create another way to do
CSV in psql, it would be totally confusing if some of the options are
the same and some are different or missing.

Another thought: Isn't CSV just the same as unaligned output plus some
quoting?  Could we add a quote character setting and then define --csv
to be quote-character = " and fieldsep = , ?

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: csv format for psql

2018-04-01 Thread Pavel Stehule
2018-04-01 8:30 GMT+02:00 Fabien COELHO :

>
> Hello Isaac,
>
> Personnaly I do not have any problem with CSV defaulting to '|' separator,
>>> given that anyway people often use anything but a comma for the purpose,
>>> including '|'.
>>>
>>> However Pavel wants to block the patch on this point. Too bad.
>>>
>>
>> OK, mostly trying to avoid commenting because I doubt I have much to add.
>> But. If I ask for CSV and don't specify any overrides, I expect to get
>> "C"omma separated values, not some other character. More specifically, if I
>> say --csv I expect to get files that are identical with what I would get if
>> I used COPY ... CSV.
>>
>
> My summary was incomplete. The --csv option implementation by Daniel
> already does that.
>
> The issue Pavel is complaining about is that in interactive mode "\pset
> format csv" does not do the same: it triggers the csv-rule string-escaping
> mechanism, but does not reset the "fieldsep" variable (eh, it sets the
> "format" variable) so the default separator under this interactive use is
> "|" if the "fieldsep" variable is shared.
>
> I have suggested a "\csv" interactive command to set both as a convenient
> shorthand for "\pset format csv & \pset fieldsep ','", similarly to --csv,
> but Pavel still wants "\pset format csv" to trigger the full csv output.
>
> A consequence I forgot about adding "fieldsep_csv", is that it probably
> has to duplicate the "_zero" ugly hack to be consistent with existing
> "*sep" variables, or else be inconsistent. Sigh.
>

there are not any relation between fieldsep_csv and fieldsep_zero.

The base of this issue is in wrong concept of fieldsep. It is based on idea
so one value can be good enough for all formats. It is worked when all
formats that used this variable is ONE format - unaligned.

similar issue is with tuplesonly option - see Isaac's mail.

For different formats the different defaults are more consistent - the
other formats does it - doesn't use fieldsep - but it lost internal
consistency, because its does quietly.

We can have different ideas about users's expectations - but we should to
share opinion, so correct solution should be consistent. If I choose some
format, then behave should be same without dependency on activation
mechanism.

so \csv should be same like \pset format csv or --csv.

I don't share option so CSV format should be exactly same like CSV COPY.
COPY is designed for backups - and header is not too important there. When
I seen some csv, then there usually header was used.

If I can search some precedent, then it is \pset linestyle and \pset
unicode_*** options.

so we can to have

\pset format xxx

and set of local possibly changed defaults

\pset csv_fieldsep ,
\pset csv_tuplesonly on
\pset unaligned_fieldsep |
\pset unaligned_tuplesonly off

the fieldsep, tuplesonly can be just links to currently used _fieldsep
and _tuplesonly

This is consistent - defaults can be correct for Isaac, and I can paste to
my .psqlrc

\pset csv_tuplesonly off

and the semantic is clean, and this option will be active just for csv, and
doesn't depend on current format - so it can be used in .psqlrc

When it is works well for linestyle, then it can works for some other too

Regards

Pavel






>
> --
> Fabien.
>


Re: csv format for psql

2018-04-01 Thread Fabien COELHO


Hello Isaac,


Personnaly I do not have any problem with CSV defaulting to '|' separator,
given that anyway people often use anything but a comma for the purpose,
including '|'.

However Pavel wants to block the patch on this point. Too bad.


OK, mostly trying to avoid commenting because I doubt I have much to 
add. But. If I ask for CSV and don't specify any overrides, I expect to 
get "C"omma separated values, not some other character. More 
specifically, if I say --csv I expect to get files that are identical 
with what I would get if I used COPY ... CSV.


My summary was incomplete. The --csv option implementation by Daniel 
already does that.


The issue Pavel is complaining about is that in interactive mode "\pset 
format csv" does not do the same: it triggers the csv-rule string-escaping 
mechanism, but does not reset the "fieldsep" variable (eh, it sets the 
"format" variable) so the default separator under this interactive use is 
"|" if the "fieldsep" variable is shared.


I have suggested a "\csv" interactive command to set both as a convenient 
shorthand for "\pset format csv & \pset fieldsep ','", similarly to --csv, 
but Pavel still wants "\pset format csv" to trigger the full csv output.


A consequence I forgot about adding "fieldsep_csv", is that it probably 
has to duplicate the "_zero" ugly hack to be consistent with existing 
"*sep" variables, or else be inconsistent. Sigh.


--
Fabien.



Re: csv format for psql

2018-03-31 Thread Isaac Morland
On 31 March 2018 at 04:33, Fabien COELHO  wrote:

>
> Bonjour Daniel,
>
> For csv, Fabien and Peter expressed the opinion that we shouldn't
>> create another fieldsep-like variable specifically for it, but instead
>> reuse fieldsep. That's what my latest patch does.
>>
>> Now it turns out that sharing fieldsep comes with some problems.
>>
>
> Personnaly I do not have any problem with CSV defaulting to '|' separator,
> given that anyway people often use anything but a comma for the purpose,
> including '|'.
>
> However Pavel wants to block the patch on this point. Too bad.
>

OK, mostly trying to avoid commenting because I doubt I have much to add.
But. If I ask for CSV and don't specify any overrides, I expect to get
"C"omma separated values, not some other character. More specifically, if I
say --csv I expect to get files that are identical with what I would get if
I used COPY ... CSV. Actually, COPY ... CSV HEADER, given that psql shows
column headings. This also implies that I expect the quoting and related
details that are associated with CSV.

And I don't think I'm a weird user. If --csv does anything even a little
different from a simple COPY invocation on the same query, at some point
it's going to bite somebody and they will rightfully curse the design
decisions taken in this thread.


Re: csv format for psql

2018-03-31 Thread Gavin Flower

On 31/03/18 21:33, Fabien COELHO wrote:


Bonjour Daniel,


For csv, Fabien and Peter expressed the opinion that we shouldn't
create another fieldsep-like variable specifically for it, but instead
reuse fieldsep. That's what my latest patch does.

Now it turns out that sharing fieldsep comes with some problems.


Personnaly I do not have any problem with CSV defaulting to '|' 
separator, given that anyway people often use anything but a comma for 
the purpose, including '|'.


However Pavel wants to block the patch on this point. Too bad.

Maybe you can do (again) a patch with a 'fieldsep_csv' or whatever 
additional variable? Not sure about 'recordsep'...


And then we forward to committers both version and they can chose 
whatever they think best, including not committing anything?


Would suggest the choice of 'fieldsep_csv' out of those two (a record 
would correspond to the whole row), and the default being a comma.



Cheers,
Gavin




Re: csv format for psql

2018-03-31 Thread Fabien COELHO


Bonjour Daniel,


For csv, Fabien and Peter expressed the opinion that we shouldn't
create another fieldsep-like variable specifically for it, but instead
reuse fieldsep. That's what my latest patch does.

Now it turns out that sharing fieldsep comes with some problems.


Personnaly I do not have any problem with CSV defaulting to '|' separator, 
given that anyway people often use anything but a comma for the purpose, 
including '|'.


However Pavel wants to block the patch on this point. Too bad.

Maybe you can do (again) a patch with a 'fieldsep_csv' or whatever 
additional variable? Not sure about 'recordsep'...


And then we forward to committers both version and they can chose 
whatever they think best, including not committing anything?


--
Fabien.



Re: csv format for psql

2018-03-30 Thread Pavel Stehule
2018-03-30 0:15 GMT+02:00 David G. Johnston :

> On Thu, Mar 29, 2018 at 7:30 AM, Daniel Verite 
> wrote:
>
>> Personally I think the benefit of sharing fieldsep is not worth these
>> problems, but I'm waiting for the discussion to continue with
>> more opinions.
>
>
> ​Apologies in advance if I mis-represent someone's position.​
>
> ​It seems like having a dedicated option is the consensus opinion.  Daniel
> (the original author) and I both prefer it, Pavel will accept it.  Fabien​
> is opposed.
>
> Peter E. was opposed, wanting to leverage both fieldsep and recordsep, but
> hasn't chimed in recently.  His opinion at this point might push this over
> the edge since he is also a committer.
>
> I would probably suggest maybe just calling it "\pset separator" to match
> the "S" in "cSv" and not have any name overlap with the fieldsep variable
> used with unaligned mode.  The user will have to learn anything and being
> more distinct should help the process.  We would not consult recordsep
> though the end-of-line choice should be platform dependent.
>

-1. The difference between fieldsep and separator is not clear, and the
relation between separator and csv is not clean too.

fieldsep_csv or csv_fieldsep is not nice, but it is clear.

Regards

Pavel


>
> David J.
>
>


Re: csv format for psql

2018-03-29 Thread David G. Johnston
On Thu, Mar 29, 2018 at 7:30 AM, Daniel Verite 
wrote:

> Personally I think the benefit of sharing fieldsep is not worth these
> problems, but I'm waiting for the discussion to continue with
> more opinions.


​Apologies in advance if I mis-represent someone's position.​

​It seems like having a dedicated option is the consensus opinion.  Daniel
(the original author) and I both prefer it, Pavel will accept it.  Fabien​
is opposed.

Peter E. was opposed, wanting to leverage both fieldsep and recordsep, but
hasn't chimed in recently.  His opinion at this point might push this over
the edge since he is also a committer.

I would probably suggest maybe just calling it "\pset separator" to match
the "S" in "cSv" and not have any name overlap with the fieldsep variable
used with unaligned mode.  The user will have to learn anything and being
more distinct should help the process.  We would not consult recordsep
though the end-of-line choice should be platform dependent.

David J.


Re: csv format for psql

2018-03-29 Thread Daniel Verite
David G. Johnston wrote:

> Unaligned format could grow its own fieldsep if it wanted to but it can
> also just use the default provided fieldsep var whose default value is
> pipe.  If it did grow one it would need to understand "not set" in order to
> preserve existing behavior.  We don't have that problem with csv.

fielsep is already owned by the unaligned format. No other format
uses fieldsep currently. Why would it needs to grow its own?
I don't quite see what you mean here.

For csv, Fabien and Peter expressed the opinion that we shouldn't
create another fieldsep-like variable specifically for it, but instead
reuse fieldsep. That's what my latest patch does.

Now it turns out that sharing fieldsep comes with some problems.

1. since a single variable can't have two different default values,
we have either to accept '|' as a default csv separator, or introduce
some tricks to automagically commute the value.

2. when a user does \pset fieldsep ';' there are doubts
on whether this should affect the current mode only, or both,
or even the other one when the user goes back and forth between
formats. Any proposal where a \pset var value affects by
side-effect another pset variable is inconsistent with what has been
done until now with these variables, and I expect that
side-effects are not exactly sought after in general.

3. the command-line cannot express: use character 'X' for unaligned
and 'Y' for csv for the rest of the session. The current patch provides
'--csv' to select both the csv format and a comma separator, but
when combined with -F, the end result depend on the relative position
of the options, which may feel unintuitive or buggy.
Again we could tweak that, but only by being inconsistent
with how we process other options.

Personally I think the benefit of sharing fieldsep is not worth these
problems, but I'm waiting for the discussion to continue with
more opinions.


Best regards,
-- 
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite



Re: csv format for psql

2018-03-29 Thread Daniel Verite
Isaac Morland wrote:

> The actual reason I'm posting this is because some of the discussion has
> made me a bit confused: there is already a CSV format defined for the COPY
> command and used by the psql \copy. I just want to check that what is being
> discussed here would use the exact same format as the existing CSV COPY

Please see the start of the thread at
https://www.postgresql.org/message-id/a8de371e-006f-4f92-ab72-2bbe3ee78f03%40manitou-mail.org
where what is proposed is compared to what \copy and COPY
already provide.

About the CSV acronym itself, the COPY documentation
https://www.postgresql.org/docs/current/static/sql-copy.html
already refers to it as "Comma Separated Values", even though
as we know, the delimiter is user-configurable.
There's no difference in psql that would justify a different wording.

Best regards,
-- 
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite



Re: csv format for psql

2018-03-29 Thread Pavel Stehule
2018-03-29 14:17 GMT+02:00 Daniel Verite :

> David G. Johnston wrote:
>
> > Or, really, just make --csv take an optional argument which, if present,
> is
> > the delimiter.
>
> I don't think psql can support optional arguments because
> psql --option foo
> would be ambiguous, since foo could be the option's value or
> the name of a database to connect to.
>

sure

Pavel


>
> Best regards,
> --
> Daniel Vérité
> PostgreSQL-powered mailer: http://www.manitou-mail.org
> Twitter: @DanielVerite
>


Re: csv format for psql

2018-03-29 Thread Daniel Verite
David G. Johnston wrote:

> Or, really, just make --csv take an optional argument which, if present, is
> the delimiter.

I don't think psql can support optional arguments because
psql --option foo
would be ambiguous, since foo could be the option's value or
the name of a database to connect to.

Best regards,
-- 
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite



Re: csv format for psql

2018-03-29 Thread Daniel Verite
David G. Johnston wrote:

> Could someone post how captions, rows-only, and footer pset settings factor
> into this?  Specifically are they fixed to on/off or will they hide/show if
> users request them explicitly?

This is described in the doc with:

+  csv format writes columns separated
+  by fieldsep, applying the CSV quoting rules
+  described in RFC-4180 and compatible with the CSV format
+  of the COPY command.
+  The header with column names is output unless the
+  tuples_only parameter is on.
+  Title and footers are not printed.
+  


Best regards,
-- 
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite



Re: csv format for psql

2018-03-28 Thread David G. Johnston
On Wed, Mar 28, 2018 at 4:19 PM, Isaac Morland 
wrote:

> On 28 March 2018 at 15:43, Joshua D. Drake  wrote:
>
>> On 03/28/2018 12:35 PM, David G. Johnston wrote:
>>
>> I like to call it "Character Separated Values" now for just that reason.
>>
>>
>> Isn't the actual wording Character Delimited Values? I may be picking at
>> hairs here but every single time I use anything to import a CSV or other
>> delimited file (TAB or | usually) that is what the import screen says.
>>
>
> CSV stands for Comma Separated Values, and not anything else common as far
> as I can tell.
>

​Language evolves - Wikipedia just hasn't recognized this particular
evolution yet :)

The actual reason I'm posting this is because some of the discussion has
> made me a bit confused: there is already a CSV format defined for the COPY
> command and used by the psql \copy. I just want to check that what is being
> discussed here would use the exact same format as the existing CSV COPY
> format; and the configurability of them should be the exact same options
> (which already includes being able to change the delimiter).
>

​Nope, the \copy meta-command, and corresponding COPY SQL command, have
additional options that I don't believe will be accessible here.​
Specifically the "OIDS", "QUOTE", and various "FORCE_" options are not
controllable, nor are ESCAPE and ENCODING (this last I think, or at least
not as flexible).


> I think it's important that Postgres not have two CSVs with slightly
> different behaviours. Indeed, psql could use COPY behind the scenes to
> generate the CSV output, which would guarantee no nasty surprises.
>
> If this is already widely accepted or if I'm horribly misunderstanding the
> discussion then I'm sorry for the noise.
>

​I agree this is a possible approach but the way the thinking is presently
is that this is just another "\pset format" option with forced defaults
excepting the delimiter (DELIMITER),presence of the column names (HEADER)​,
NULL (I think...)

​Issuing \copy with the equivalent settings should result in the same
output...but the two mechanisms are distinct for better and worse.

David J.


Re: csv format for psql

2018-03-28 Thread Isaac Morland
On 28 March 2018 at 15:43, Joshua D. Drake  wrote:

> On 03/28/2018 12:35 PM, David G. Johnston wrote:
>
> I like to call it "Character Separated Values" now for just that reason.
>
>
> Isn't the actual wording Character Delimited Values? I may be picking at
> hairs here but every single time I use anything to import a CSV or other
> delimited file (TAB or | usually) that is what the import screen says.
>

CSV stands for Comma Separated Values, and not anything else common as far
as I can tell. A Google search for "csv" turns up the Wikipedia page
describing the file format as the first hit, followed by the Wikipedia
disambiguation page for CSV, which links to the aforementioned Wikipedia
page as the only file-format-related link.

The actual reason I'm posting this is because some of the discussion has
made me a bit confused: there is already a CSV format defined for the COPY
command and used by the psql \copy. I just want to check that what is being
discussed here would use the exact same format as the existing CSV COPY
format; and the configurability of them should be the exact same options
(which already includes being able to change the delimiter). I think it's
important that Postgres not have two CSVs with slightly different
behaviours. Indeed, psql could use COPY behind the scenes to generate the
CSV output, which would guarantee no nasty surprises.

If this is already widely accepted or if I'm horribly misunderstanding the
discussion then I'm sorry for the noise.


Re: csv format for psql

2018-03-28 Thread David G. Johnston
On Wednesday, March 28, 2018, Fabien COELHO  wrote:

>
>
> And if we introduce csv-specific fieldsep, then we multiply this wrong
>> design. The fix in this direction is renaming fieldsep to fieldsep-unaliagn
>> - but it is probably too big change too. So this design is nothing what I
>> can mark as good solution.
>>
>
> Good, we somehow agree on something!
>
> I can live with because it is much better than using pipe as separator for
>> csv, and because real impact is small - for almost people it will be
>> invisible - but have not good feeling from it.
>
>
Concretely...I'm in favor of the "\pset fieldsep_csv ," solution and csv
format should always use its existing value.  Teach \pset fieldsep to fail
if the current format is csv.  Being able to specify the csv fieldsep like
 "\pset format csv ;" would be a plus.

Unaligned format could grow its own fieldsep if it wanted to but it can
also just use the default provided fieldsep var whose default value is
pipe.  If it did grow one it would need to understand "not set" in order to
preserve existing behavior.  We don't have that problem with csv.

I don't believe we can modify fieldsep without causing unwanted grief.

David J.


Re: csv format for psql

2018-03-28 Thread Joshua D. Drake

On 03/28/2018 12:35 PM, David G. Johnston wrote:
On Monday, March 26, 2018, Daniel Verite > wrote:



We could even support only the comma and make it non-configurable
based on the fact it's Comma-Separated-Values, not
Whatever-Separated-Values, except that won't do much
to serve the users interests, as the reality is that
people use various separators.


I like to call it "Character Separated Values" now for just that reason.


Isn't the actual wording Character Delimited Values? I may be picking at 
hairs here but every single time I use anything to import a CSV or other 
delimited file (TAB or | usually) that is what the import screen says.


JD



David J.



--
Command Prompt, Inc. || http://the.postgres.company/ || @cmdpromptinc
***  A fault and talent of mine is to tell it exactly how it is.  ***
PostgreSQL centered full stack support, consulting and development.
Advocate: @amplifypostgres || Learn: https://postgresconf.org
* Unless otherwise stated, opinions are my own.   *



Re: csv format for psql

2018-03-28 Thread David G. Johnston
On Monday, March 26, 2018, Daniel Verite  wrote:

>
> We could even support only the comma and make it non-configurable
> based on the fact it's Comma-Separated-Values, not
> Whatever-Separated-Values, except that won't do much
> to serve the users interests, as the reality is that
> people use various separators.
>

I like to call it "Character Separated Values" now for just that reason.

David J.


Re: csv format for psql

2018-03-28 Thread David G. Johnston
On Wednesday, March 28, 2018, David G. Johnston 
wrote:

> On Wednesday, March 28, 2018, Pavel Stehule 
> wrote:
>
>>
>> Are there some possible alternatives?

>>>
>>> Given the date and the fact that the cf end is 3 days away, the proposed
>>> short term alternative is Daniel's version, that I feel is reasonable. Ok,
>>> people have to do two pset to get comma-separated csv, otherwise they get
>>> pipe-separated csv in one pset.
>>>
>>
>>
> Could someone post how captions, rows-only, and footer pset settings
> factor into this?  Specifically are they fixed to on/off or will they
> hide/show if users request them explicitly?
>
>
I found the original post that covers that indeed we simply fix these
values, which is why I was thinking.

 and something like --csv-fieldsep should be provided (I like the prefixing
> to tie the option lexically to the master --csv option).
>

Or, really, just make --csv take an optional argument which, if present, is
the delimiter.  No separate argument needed, and we ignore the pset
fieldsep argument like we ignore everything else.

Need to decide whether to "not ignore" --tuples-only, which doesn't seem
controversial aside from being a pset argument that isn't being ignored in
this design...

David J.


Re: csv format for psql

2018-03-28 Thread David G. Johnston
On Wednesday, March 28, 2018, Pavel Stehule  wrote:

>
> Are there some possible alternatives?
>>>
>>
>> Given the date and the fact that the cf end is 3 days away, the proposed
>> short term alternative is Daniel's version, that I feel is reasonable. Ok,
>> people have to do two pset to get comma-separated csv, otherwise they get
>> pipe-separated csv in one pset.
>>
>
>
Could someone post how captions, rows-only, and footer pset settings factor
into this?  Specifically are they fixed to on/off or will they hide/show if
users request them explicitly?

My take on this is that --csv mode is/should be an alternate output mode
from the existing pset controlled one, and functions basically like "\copy
to stdout" and all echoing and metadata outputs are disabled and only query
results, with header and the user specified delimiter, are output.  No
control other than the delimiter seems to be provided in the current design
but that could be expanded upon.  In that specification the existing
fieldsep argument that is tied to pset should not be used and something
like --csv-fieldsep should be provided (I like the prefixing to tie the
option lexically to the master --csv option).

David J.


Re: Re: csv format for psql

2018-03-28 Thread Pavel Stehule
2018-03-28 10:24 GMT+02:00 Fabien COELHO :

>
> Hello Pavel,
>
> I'd like to convince you to compromise, because otherwise I'm afraid we
>>> will not get this feature.
>>>
>>
>> [...]
>>
>>>
>>> The only no-surprise, no-behavioral-change, alternative I see is to have
>>> a
>>> csv-specific fieldsep. I'm not keen on that one because it is yet another
>>> variable, one has to remember it, and then it asks the question about
>>> recordsep... and I think there are already too many parameters, and more
>>> is
>>> not desirable, although I probably could live with that if I can live
>>> with
>>> "fieldsep_zero":-)
>>>
>>>
>>> I don't share your opinion so introduction csv-specific fieldsep is
>> surprise less. Current design is wrong - this thread is a evidence.
>>
>
> Wrong is too strong a word. Current design is not perfect, sure. Proposed
> alternatives are somehow worse, at least to some people mind, which
> explains why we arrived on this version after a few iterations.
>
> And if we introduce csv-specific fieldsep, then we multiply this wrong
>> design. The fix in this direction is renaming fieldsep to fieldsep-unaliagn
>> - but it is probably too big change too. So this design is nothing what I
>> can mark as good solution.
>>
>
> Good, we somehow agree on something!
>
> I can live with because it is much better than using pipe as separator for
>> csv, and because real impact is small - for almost people it will be
>> invisible - but have not good feeling from it.
>>
>
> Are there some possible alternatives?
>>
>
> Given the date and the fact that the cf end is 3 days away, the proposed
> short term alternative is Daniel's version, that I feel is reasonable. Ok,
> people have to do two pset to get comma-separated csv, otherwise they get
> pipe-separated csv in one pset.
>

I dislike the last Daniel's design. I am sorry.


> You could compromise on that for now, and submit an improvement patch for
> a later version if you wish.
>

I am able to accept csv specific field sep independent on unaligned field
sep.

I have not any other idea. And there is not any good (acceptable) ideas. It
is hard to expect so there will be change next year. This space is small,
and there are not too much possible variants. We checked all possibility
without any agreement.


>
> Otherwise, ISTM that the current result of this thread is that there will
> be no simple CSV in pg11:-(
>

Can be. If there is not good enough solution now. If we merge it now, then
will be hard to change it due possible compatibility issues.


>
> Or maybe I can mark Daniel's latest version as "ready" and point out that
> there is some disagreement on the thread, so it is not a full consensus.
> Then a committer can decide whether it is better in like that or that it
> should be put back in some design stage, possibly with their preference wrt
> to the kind of solution they think best.


You can do it. But from my view the Daniel's latest version (with respect
to his work) is the worst variant :(. So I am against to commit this
version.

Regards

Pavel



>
> --
> Fabien.
>


Re: Re: csv format for psql

2018-03-27 Thread Pavel Stehule
2018-03-27 11:35 GMT+02:00 Fabien COELHO :

>
> Hello Pavel,
>
> I'd like to convince you to compromise, because otherwise I'm afraid we
> will not get this feature.
>
> 1. use special default string for formats that doesn't field sep - like
>>>
 "not used" or some
 2. I can implemet the option fieldsep_default - very similary to
 fieldsep_zero to reset fieldsep to default state.


>>> I strongly dislike option 2, as expressed above. I would
>>> enthousiastically
>>> review any patch that would aim are removing these "*_zero" options. I
>>> might submit it someday.
>>>
>>>
>> I can remove it simply - but a alternative is implementation of some
>> \pset_reset command maybe:
>>
>> \pset reset fieldsep
>>
>> what do you think?
>>
>
> I think that changing the semantics of \pset is a nonstarter, it should be
> only about "output [p]arameter [set]ting", because it has been like that
> for the last XX years and people are used to it.
>
> I'm more unclear about option 1. Maybe it could be managed cleanly.
>>>
>>> I'm still at odds that it would mean that \pset would not show the actual
>>> setting anymore, but something harder to explain, "actual value or some
>>> format-specific value, depending".
>>>
>>
>> It can be formulated little bit different - "when a value of field
>> separator is not entered, then format specific default is used (if can be
>> specified - some formats doesn't allow to specify field separator)."
>>
>
> Well, that is 3 lines of explanations where people thought they were just
> setting a simple variable to a simple value, or showing the actual current
> value which would be used if needed.
>
> My opinion is that some of what you are suggesting could have participated
> to an alternate and interesting output-parameters-for-psql design, but we
> are a much too late to change that.
>
> The purpose of the patch is just to enable having clean CSV quite easily
> from psql, possibly for pg11... Changing the design and the underlying user
> visible behavior would require a much wider and difficult to obtain
> consensus, and is very unlikely to get in for pg11, if ever.
>
> The current version allows "--csv" or need two "\pset" to achieve CSV,
> without changing the preceding behavior, however unperfect it is...
>
> Could you live with that with the benefit of getting it in? I do not claim
> it is a perfect solution, just that it is a reasonable one.


> The dynamic default changing approach departs too much for the current
> user expectations, the user-benefit is not worth it, and committers are
> very likely to think like that.
>
> The only no-surprise, no-behavioral-change, alternative I see is to have a
> csv-specific fieldsep. I'm not keen on that one because it is yet another
> variable, one has to remember it, and then it asks the question about
> recordsep... and I think there are already too many parameters, and more is
> not desirable, although I probably could live with that if I can live with
> "fieldsep_zero":-)
>
>
I don't share your opinion so introduction csv-specific fieldsep is
surprise less. Current design is wrong - this thread is a evidence. And if
we introduce csv-specific fieldsep, then we multiply this wrong design. The
fix in this direction is renaming fieldsep to fieldsep-unaliagn - but it is
probably too big change too. So this design is nothing what I can mark as
good solution. I can live with because it is much better than using pipe as
separator for csv, and because real impact is small - for almost people it
will be invisible - but have not good feeling from it. Are there some
possible alternatives?

Regards

Pavel




> --
> Fabien.
>


Re: Re: csv format for psql

2018-03-27 Thread Fabien COELHO


Hello Pavel,

I'd like to convince you to compromise, because otherwise I'm afraid we 
will not get this feature.



1. use special default string for formats that doesn't field sep - like

"not used" or some
2. I can implemet the option fieldsep_default - very similary to
fieldsep_zero to reset fieldsep to default state.



I strongly dislike option 2, as expressed above. I would enthousiastically
review any patch that would aim are removing these "*_zero" options. I
might submit it someday.



I can remove it simply - but a alternative is implementation of some
\pset_reset command maybe:

\pset reset fieldsep

what do you think?


I think that changing the semantics of \pset is a nonstarter, it should 
be only about "output [p]arameter [set]ting", because it has been like 
that for the last XX years and people are used to it.



I'm more unclear about option 1. Maybe it could be managed cleanly.

I'm still at odds that it would mean that \pset would not show the actual
setting anymore, but something harder to explain, "actual value or some
format-specific value, depending".


It can be formulated little bit different - "when a value of field
separator is not entered, then format specific default is used (if can be
specified - some formats doesn't allow to specify field separator)."


Well, that is 3 lines of explanations where people thought they were just 
setting a simple variable to a simple value, or showing the actual current 
value which would be used if needed.


My opinion is that some of what you are suggesting could have participated 
to an alternate and interesting output-parameters-for-psql design, but we 
are a much too late to change that.


The purpose of the patch is just to enable having clean CSV quite easily 
from psql, possibly for pg11... Changing the design and the underlying 
user visible behavior would require a much wider and difficult to obtain 
consensus, and is very unlikely to get in for pg11, if ever.


The current version allows "--csv" or need two "\pset" to achieve CSV, 
without changing the preceding behavior, however unperfect it is...


Could you live with that with the benefit of getting it in? I do not claim 
it is a perfect solution, just that it is a reasonable one.


The dynamic default changing approach departs too much for the current 
user expectations, the user-benefit is not worth it, and committers are 
very likely to think like that.


The only no-surprise, no-behavioral-change, alternative I see is to have a 
csv-specific fieldsep. I'm not keen on that one because it is yet another 
variable, one has to remember it, and then it asks the question about 
recordsep... and I think there are already too many parameters, and more 
is not desirable, although I probably could live with that if I can live 
with "fieldsep_zero":-)


--
Fabien.



Re: Re: csv format for psql

2018-03-26 Thread Pavel Stehule
2018-03-26 20:26 GMT+02:00 Daniel Verite :

> Pavel Stehule wrote:
>
> > b) the list of pset options is bloating - every possible new format can
> > introduce fieldsep_X option
>
> What new format?
> The usefulness of fieldsep does not extend outside of xSV, and it's
> no suprise that there have been no other use for a fieldsep-like
> variable until now, with the other formats supported in psql.
>

I can imagine format based on \t as field separator, or some other
invisible chars.


>
> In fact it's even absurd for any format to use a non-fixed separator
> at a place that is key for being parsed unambiguously.
>
> We could even support only the comma and make it non-configurable
> based on the fact it's Comma-Separated-Values, not
> Whatever-Separated-Values, except that won't do much
> to serve the users interests, as the reality is that
> people use various separators.
>

I don't think so allow only comma as separator is the solution.

Regards

Pavel


>
>
> Best regards,
> --
> Daniel Vérité
> PostgreSQL-powered mailer: http://www.manitou-mail.org
> Twitter: @DanielVerite
>


Re: Re: csv format for psql

2018-03-26 Thread Daniel Verite
Pavel Stehule wrote:

> b) the list of pset options is bloating - every possible new format can
> introduce fieldsep_X option

What new format?
The usefulness of fieldsep does not extend outside of xSV, and it's
no suprise that there have been no other use for a fieldsep-like
variable until now, with the other formats supported in psql.

In fact it's even absurd for any format to use a non-fixed separator
at a place that is key for being parsed unambiguously.

We could even support only the comma and make it non-configurable
based on the fact it's Comma-Separated-Values, not
Whatever-Separated-Values, except that won't do much
to serve the users interests, as the reality is that
people use various separators.


Best regards,
-- 
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite



Re: Re: csv format for psql

2018-03-26 Thread Pavel Stehule
2018-03-26 14:44 GMT+02:00 Daniel Verite :

> Pavel Stehule wrote:
>
> > implemented in attached patch
>
> Consider your initial output of \pset, when no option is specified
> =
> $ ./psql
> psql (11devel)
> Type "help" for help.
>
> postgres=# \pset
> border   1
> columns  0
> expanded off
> fieldsep not used
> fieldsep_zeroERROR
> footer   on
> format   aligned
> linestyleascii
> null ''
> numericlocaleoff
> pager1
> pager_min_lines  0
> recordsep'\n'
> recordsep_zero   off
> resetERROR
> tableattr
> title
> tuples_only  off
> unicode_border_linestyle single
> unicode_column_linestyle single
> unicode_header_linestyle single
> 
>
> These entries with ERROR correspond in fact to no error at all,
> or we have to pretend that the default state of psql is erroneous,
> which doesn't make sense.
>
> Also "reset" is not a variable, it seems to be a command,
> so it probably shouldn't be there in the first place.
>

fixed


>
> More generally, I'd think the point of reusing "fieldsep" was to
> reuse the concept, not reimplement it, let alone changing
> bits of behavior of the unaligned mode along the way.
>
> With this patch, again without specifying any option, just looking
> at what fieldsep is leads to this:
>
> postgres=# \pset fieldsep
> User didn't specified field separator.
> Current format doesn't specify default field separator.
>
> If this is the way to "solve" the fact that a user has to do
>  \pset fieldsep ','
> to get commas in csv mode, then IMV the proposed solution
> is clearly worse than the stated problem, and worse than
> simply adding fieldsep_csv to be independant from the
> unaligned mode.
>
>
I don't understand what is wrong there? There can be any message (short -
just "unset",...). I understand so default visual can look strange, because
aligned mode has not default field separator, but I don't see any other
possibility.

Can I do some recapitulation:

1. using CSV with default | as field separator is wrong - probably there is
a agreement

2. partial solution is fieldsep_X where X will be format name. It is better
than using | for csv, but it introduces new problems:

a) there are not new long, short options for this value

b) the list of pset options is bloating - every possible new format can
introduce fieldsep_X option

c) can be messy for people, because the setting fieldsep can has zero
effect on csv or some other formats that don't share default with unaligned
format.

So my position - I am very strong against to introduce CSV format with | as
field separator, and I am not happy if we introduce fieldsep_X like
options, because it is not too good too.

Regards

Pavel



>
> Best regards,
> --
> Daniel Vérité
> PostgreSQL-powered mailer: http://www.manitou-mail.org
> Twitter: @DanielVerite
>
diff --git a/doc/src/sgml/ref/psql-ref.sgml b/doc/src/sgml/ref/psql-ref.sgml
index 10b97950ec..c984a9cbaa 100644
--- a/doc/src/sgml/ref/psql-ref.sgml
+++ b/doc/src/sgml/ref/psql-ref.sgml
@@ -151,6 +151,16 @@ EOF
   
 
 
+
+  --csv
+  
+  
+  Switches to csv output mode. This is equivalent to \pset format
+  csv followed by \pset fieldsep ','.
+  
+  
+
+
 
   -d dbname
   --dbname=dbname
@@ -246,7 +256,7 @@ EOF
   
   
   Use separator as the
-  field separator for unaligned output. This is equivalent to
+  field separator for unaligned and csv outputs. This is equivalent to
   \pset fieldsep or \f.
   
   
@@ -382,7 +392,7 @@ EOF
   
   
   Use separator as the
-  record separator for unaligned output. This is equivalent to
+  record separator for unaligned and csv outputs. This is equivalent to
   \pset recordsep.
   
   
@@ -558,7 +568,7 @@ EOF
   
   
   Set the field separator for unaligned output to a zero byte.  This is
-  equvalent to \pset fieldsep_zero.
+  equivalent to \pset fieldsep_zero.
   
   
 
@@ -1937,9 +1947,9 @@ Tue Oct 26 21:40:57 CEST 1999
 
 
 
-Sets the field separator for unaligned query output. The default
-is the vertical bar (|). It is equivalent to
-\pset fieldsep.
+Sets the field separator for unaligned and csv query outputs. The
+default is the vertical bar (|). It is equivalent
+to \pset fieldsep.
 
 
   
@@ -2546,8 +2556,8 @@ lo_import 152801
   fieldsep
   
   
-  Specifies the field separator to be used in unaligned output
-  format. That way one can create, for example, tab- or
+  Specifies the field separator to be used in unaligned and csv output
+  formats. That way one 

Re: Re: csv format for psql

2018-03-26 Thread Daniel Verite
Pavel Stehule wrote:

> implemented in attached patch

Consider your initial output of \pset, when no option is specified
=
$ ./psql 
psql (11devel)
Type "help" for help.

postgres=# \pset
border   1
columns  0
expanded off
fieldsep not used
fieldsep_zeroERROR
footer   on
format   aligned
linestyleascii
null ''
numericlocaleoff
pager1
pager_min_lines  0
recordsep'\n'
recordsep_zero   off
resetERROR
tableattr
title
tuples_only  off
unicode_border_linestyle single
unicode_column_linestyle single
unicode_header_linestyle single


These entries with ERROR correspond in fact to no error at all,
or we have to pretend that the default state of psql is erroneous,
which doesn't make sense.

Also "reset" is not a variable, it seems to be a command,
so it probably shouldn't be there in the first place.

More generally, I'd think the point of reusing "fieldsep" was to 
reuse the concept, not reimplement it, let alone changing
bits of behavior of the unaligned mode along the way.

With this patch, again without specifying any option, just looking
at what fieldsep is leads to this:

postgres=# \pset fieldsep
User didn't specified field separator.
Current format doesn't specify default field separator.

If this is the way to "solve" the fact that a user has to do
 \pset fieldsep ','
to get commas in csv mode, then IMV the proposed solution
is clearly worse than the stated problem, and worse than
simply adding fieldsep_csv to be independant from the
unaligned mode.


Best regards,
-- 
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite



Re: Re: csv format for psql

2018-03-25 Thread Pavel Stehule
2018-03-25 18:02 GMT+02:00 Pavel Stehule :

>
>
> 2018-03-25 15:21 GMT+02:00 Fabien COELHO :
>
>>
>>
>> Hello Pavel,
>>
>> [...]

>>> it is correct. Default format is aligned, that doesn't use fieldsep.
>>>
>>
>> My comment is that currently fieldsep is kind of a variable, the value of
>> which is displayed and reliable wrt commands executed afterwards, and the
>> proposed approach changes that by adding a new "default" state.
>>
>> [...]

>>> There should be similar mechanism like fieldsep_zero, that reset
>>> settings.
>>> Some like fieldsep_default.
>>>
>>
>> The user-visible fieldsep_zero is a horrible hack which exposes the fact
>> that C strings cannot hold character zero because it is the end of string
>> marker. The user should not have to care! I wish it would be simply
>> removed, and that "\pset fieldsep '\0'" or something like that would do the
>> trick.
>>
>> Having another one of these, which would create 4 boolean states which
>> would have to be interpreted (eg if default is true and zero is true and
>> sep is ';', and format is csv, what will I get?) by the user to guess what
>> is going to happen looks unhelpful.
>>
>> For me adding another user-visible fieldsep-related boolean a no go.
>>
>> Also, if I do "\pset fielsep ''" then the \pset output does not allow to
 distinguish between the unset state and set to empty state.

>>>
>>> This is question - how to fieldsep be related to current format. Aligned
>>> format doesn't use fieldsep.
>>>
>>
>> For me \pset reflects the current value of output options. With the
>> proposed approach it does not mean that anymore, as already said above.
>>
>> I would not expect a committer to accept such a strange no-possible-return
 to previous state because of a hidden state (whether fieldsep has been
 set
 or not in the past) behavior.

>>>
>>> This design is very similar to already implemented fieldsep_zero - look
>>> to
>>> code. It is nothing new.
>>>
>>
>> Alas you are right. I suggest not to go on the same path again:-)
>>
>> So I can do better?
>>>
>>
>> Dunno. Possibly.
>>
>> 1. use special default string for formats that doesn't field sep - like
>>> "not used" or some
>>> 2. I can implemet the option fieldsep_default - very similary to
>>> fieldsep_zero to reset fieldsep to default state.
>>>
>>
>> I strongly dislike option 2, as expressed above. I would
>> enthousiastically review any patch that would aim are removing these
>> "*_zero" options. I might submit it someday.
>>
>
> I can remove it simply - but a alternative is implementation of some
> \pset_reset command maybe:
>
> \pset reset fieldsep
>
> what do you think?
>

implemented in attached patch

>
>
>>
>> I'm more unclear about option 1. Maybe it could be managed cleanly.
>>
>> I'm still at odds that it would mean that \pset would not show the actual
>> setting anymore, but something harder to explain, "actual value or some
>> format-specific value, depending".
>>
>
> It can be formulated little bit different - "when a value of field
> separator is not entered, then format specific default is used (if can be
> specified - some formats doesn't allow to specify field separator)."
>

it can be clean from following result

 postgres=# \pset fieldsep
User didn't specified field separator.
Current format doesn't specify default field separator.
postgres=# \pset fieldsep ;
Field separator is ";".
postgres=# \pset fieldsep
Field separator is ";".

fieldsep_default was removed

Regards

Pavel



> Regards
>
> Pavel
>
>
>>
>> --
>> Fabien.
>>
>
>
diff --git a/doc/src/sgml/ref/psql-ref.sgml b/doc/src/sgml/ref/psql-ref.sgml
index 10b97950ec..c984a9cbaa 100644
--- a/doc/src/sgml/ref/psql-ref.sgml
+++ b/doc/src/sgml/ref/psql-ref.sgml
@@ -151,6 +151,16 @@ EOF
   
 
 
+
+  --csv
+  
+  
+  Switches to csv output mode. This is equivalent to \pset format
+  csv followed by \pset fieldsep ','.
+  
+  
+
+
 
   -d dbname
   --dbname=dbname
@@ -246,7 +256,7 @@ EOF
   
   
   Use separator as the
-  field separator for unaligned output. This is equivalent to
+  field separator for unaligned and csv outputs. This is equivalent to
   \pset fieldsep or \f.
   
   
@@ -382,7 +392,7 @@ EOF
   
   
   Use separator as the
-  record separator for unaligned output. This is equivalent to
+  record separator for unaligned and csv outputs. This is equivalent to
   \pset recordsep.
   
   
@@ -558,7 +568,7 @@ EOF
   
   
   Set the field separator for unaligned output to a zero byte.  This is
-  equvalent to \pset fieldsep_zero.
+  equivalent to \pset fieldsep_zero.
   
   
 
@@ -1937,9 +1947,9 @@ Tue Oct 26 21:40:57 CEST 1999
 
 
 
-Sets the field separator for unaligned query output. The default
-is the vertical bar (|). It is equivalent to
-\pset 

Re: Re: csv format for psql

2018-03-25 Thread Pavel Stehule
2018-03-25 15:21 GMT+02:00 Fabien COELHO :

>
>
> Hello Pavel,
>
> [...]
>>>
>> it is correct. Default format is aligned, that doesn't use fieldsep.
>>
>
> My comment is that currently fieldsep is kind of a variable, the value of
> which is displayed and reliable wrt commands executed afterwards, and the
> proposed approach changes that by adding a new "default" state.
>
> [...]
>>>
>> There should be similar mechanism like fieldsep_zero, that reset settings.
>> Some like fieldsep_default.
>>
>
> The user-visible fieldsep_zero is a horrible hack which exposes the fact
> that C strings cannot hold character zero because it is the end of string
> marker. The user should not have to care! I wish it would be simply
> removed, and that "\pset fieldsep '\0'" or something like that would do the
> trick.
>
> Having another one of these, which would create 4 boolean states which
> would have to be interpreted (eg if default is true and zero is true and
> sep is ';', and format is csv, what will I get?) by the user to guess what
> is going to happen looks unhelpful.
>
> For me adding another user-visible fieldsep-related boolean a no go.
>
> Also, if I do "\pset fielsep ''" then the \pset output does not allow to
>>> distinguish between the unset state and set to empty state.
>>>
>>
>> This is question - how to fieldsep be related to current format. Aligned
>> format doesn't use fieldsep.
>>
>
> For me \pset reflects the current value of output options. With the
> proposed approach it does not mean that anymore, as already said above.
>
> I would not expect a committer to accept such a strange no-possible-return
>>> to previous state because of a hidden state (whether fieldsep has been
>>> set
>>> or not in the past) behavior.
>>>
>>
>> This design is very similar to already implemented fieldsep_zero - look to
>> code. It is nothing new.
>>
>
> Alas you are right. I suggest not to go on the same path again:-)
>
> So I can do better?
>>
>
> Dunno. Possibly.
>
> 1. use special default string for formats that doesn't field sep - like
>> "not used" or some
>> 2. I can implemet the option fieldsep_default - very similary to
>> fieldsep_zero to reset fieldsep to default state.
>>
>
> I strongly dislike option 2, as expressed above. I would enthousiastically
> review any patch that would aim are removing these "*_zero" options. I
> might submit it someday.
>

I can remove it simply - but a alternative is implementation of some
\pset_reset command maybe:

\pset reset fieldsep

what do you think?


>
> I'm more unclear about option 1. Maybe it could be managed cleanly.
>
> I'm still at odds that it would mean that \pset would not show the actual
> setting anymore, but something harder to explain, "actual value or some
> format-specific value, depending".
>

It can be formulated little bit different - "when a value of field
separator is not entered, then format specific default is used (if can be
specified - some formats doesn't allow to specify field separator)."

Regards

Pavel


>
> --
> Fabien.
>


Re: Re: csv format for psql

2018-03-24 Thread Pavel Stehule
2018-03-24 10:06 GMT+01:00 Fabien COELHO <coe...@cri.ensmp.fr>:

>
> Hello Pavel,
>
> I'm suggesting to add \csv which would behave like \H to toggle CSV
>>>> mode so as to improve this situation, with a caveat which is that
>>>> toggling back \csv would have forgotted the previous settings (just
>>>> like \H does, though, so would for instance reset to aligned with
>>>> |),
>>>> so it would not be perfect.
>>>>
>>>
>>> this doesn't solve usual format settings by \pset format csv
>>>
>>
> Indeed, it does not. Alas, "format" is not strictly about format but more
> about string escaping.
>
> (2) your proposal as I understand it:
>>>>
>>>> "\pset format csv" may or may not use the fieldsep, depending on
>>>> whether it was explicitely set, an information which is not shown,
>>>> i.e.:
>>>>
>>>>   \pset fieldsep # fieldsep separator is "|"
>>>>   \pset format csv # would output a,b,c or a|b|c...
>>>>
>>>> Because it depends on whether fieldsep was set explicitely to '|' or
>>>> whether it has this value but it was due to the default.
>>>>
>>>> This kind of unclear behavioral determinism does not seem desirable.
>>>>
>>>>
>>> please, check and test attached patch. It is very simply for usage - and
>>> there is not any unclear behave. Just you should to accept so formats can
>>> have own defaults for separators.
>>>
>>
> I checked, and the behavior is intrinsically strange.
>
> Your patch shows an empty '' fieldsep at startup, which is debatable
> because it is not its actual value.
>

it is correct. Default format is aligned, that doesn't use fieldsep.


>
>  psql> \pset # fieldsep '' -- doubtful
>
> ISTM that at the minimum it should show a special "" or
> whatever value, which creates some kind of exception because this special
> value cannot be set and special values are a bad thing.
>
> When a format is set, a current default fielsep is shown.
>
>  psql> \pset format unaligned
>  psql> \pset # fieldsep '|' -- new default shown
>  psql> SELECT 1 AS one, 2 AS two;
>one|two
>1|2
>
>  psql> \pset format csv
>  psql> \pset # fieldsep ',' -- new default shown
>
>  psql> SELECT 1 AS one, 2 AS two;
>one,two
>1,2
>
> However, if fieldsep is set once, the behaviors changes indefinitely:
>
>  psql> \pset fieldsep '|'
>  psql> \pset format unaligned
># unaligned with '|'
>
>  psql> \pset format csv
># csv with '|'
>
>
There should be similar mechanism like fieldsep_zero, that reset settings.
Some like fieldsep_default.


> That is once the fieldsep has been used, you are back to the v4 behavior
> that you are disagreeing with in the first place.
>
> The patch creates an incomplete state automaton which some unreachable
> states, because once a value is set there is no way to switch back to the
> previous "use default" behavior.
>

It is not implemented, but it is small problem


>
> Also, if I do "\pset fielsep ''" then the \pset output does not allow to
> distinguish between the unset state and set to empty state.
>

This is question - how to fieldsep be related to current format. Aligned
format doesn't use fieldsep.

The alternative can be words "not used"  -- and then you can see difference
between "" and other. But if you see the code, the empty string is used
like info about empty fieldsep now.


> I would not expect a committer to accept such a strange no-possible-return
> to previous state because of a hidden state (whether fieldsep has been set
> or not in the past) behavior.
>

This design is very similar to already implemented fieldsep_zero - look to
code. It is nothing new.


>
> So I do not think that this proposed version is really satisfactory.
>

So I can do better?

1. use special default string for formats that doesn't field sep - like
"not used" or some
2. I can implemet the option fieldsep_default - very similary to
fieldsep_zero to reset fieldsep to default state.

please, check updated patch

Regards

Pavel

>
> --
> Fabien.
>
diff --git a/doc/src/sgml/ref/psql-ref.sgml b/doc/src/sgml/ref/psql-ref.sgml
index 10b97950ec..c984a9cbaa 100644
--- a/doc/src/sgml/ref/psql-ref.sgml
+++ b/doc/src/sgml/ref/psql-ref.sgml
@@ -151,6 +151,16 @@ EOF
   
 
 
+
+  --csv
+  
+  
+  Switches to csv output mode. This is equivalent to \pset format
+  csv followed by \pset fieldsep ',

Re: Re: csv format for psql

2018-03-24 Thread Fabien COELHO


Hello Pavel,


I'm suggesting to add \csv which would behave like \H to toggle CSV
mode so as to improve this situation, with a caveat which is that
toggling back \csv would have forgotted the previous settings (just
like \H does, though, so would for instance reset to aligned with |),
so it would not be perfect.


this doesn't solve usual format settings by \pset format csv


Indeed, it does not. Alas, "format" is not strictly about format but more 
about string escaping.



(2) your proposal as I understand it:

"\pset format csv" may or may not use the fieldsep, depending on
whether it was explicitely set, an information which is not shown,
i.e.:

  \pset fieldsep # fieldsep separator is "|"
  \pset format csv # would output a,b,c or a|b|c...

Because it depends on whether fieldsep was set explicitely to '|' or
whether it has this value but it was due to the default.

This kind of unclear behavioral determinism does not seem desirable.



please, check and test attached patch. It is very simply for usage - and
there is not any unclear behave. Just you should to accept so formats can
have own defaults for separators.


I checked, and the behavior is intrinsically strange.

Your patch shows an empty '' fieldsep at startup, which is debatable 
because it is not its actual value.


 psql> \pset # fieldsep '' -- doubtful

ISTM that at the minimum it should show a special "" or 
whatever value, which creates some kind of exception because this special 
value cannot be set and special values are a bad thing.


When a format is set, a current default fielsep is shown.

 psql> \pset format unaligned
 psql> \pset # fieldsep '|' -- new default shown
 psql> SELECT 1 AS one, 2 AS two;
   one|two
   1|2

 psql> \pset format csv
 psql> \pset # fieldsep ',' -- new default shown

 psql> SELECT 1 AS one, 2 AS two;
   one,two
   1,2

However, if fieldsep is set once, the behaviors changes indefinitely:

 psql> \pset fieldsep '|'
 psql> \pset format unaligned
   # unaligned with '|'

 psql> \pset format csv
   # csv with '|'

That is once the fieldsep has been used, you are back to the v4 behavior 
that you are disagreeing with in the first place.


The patch creates an incomplete state automaton which some unreachable 
states, because once a value is set there is no way to switch back to the 
previous "use default" behavior.


Also, if I do "\pset fielsep ''" then the \pset output does not allow to 
distinguish between the unset state and set to empty state.


I would not expect a committer to accept such a strange no-possible-return 
to previous state because of a hidden state (whether fieldsep has been set 
or not in the past) behavior.


So I do not think that this proposed version is really satisfactory.

--
Fabien.



Re: Re: csv format for psql

2018-03-24 Thread Pavel Stehule
minor fix
>

all is working now

[pavel@nemesis postgresql]$ psql  --csv -F ';' -c "table foo" postgres
a;b;c
3;4;Nazdar
3;4;Nazdar
[pavel@nemesis postgresql]$ psql  -F ';' --csv  -c "table foo" postgres
a;b;c
3;4;Nazdar
3;4;Nazdar
[pavel@nemesis postgresql]$ psql  --csv  -c "table foo" postgres
a,b,c
3,4,Nazdar
3,4,Nazdar
[pavel@nemesis postgresql]$ psql  -A  -c "table foo" postgres
a|b|c
3|4|Nazdar
3|4|Nazdar
(2 rows)
[pavel@nemesis postgresql]$ psql  -A -F , -c "table foo" postgres
a,b,c
3,4,Nazdar
3,4,Nazdar
(2 rows)

Regards

Pavel



>
> Regards
>
> Pavel
>
>
>>
>> Regards
>>
>> Pavel
>>
>>>
>>> --
>>> Fabien.
>>>
>>
>>
>


Re: Re: csv format for psql

2018-03-24 Thread Pavel Stehule
2018-03-24 8:24 GMT+01:00 Pavel Stehule :

>
>
> 2018-03-24 8:15 GMT+01:00 Fabien COELHO :
>
>>
>> Hello Pavel,
>>
>> The patch adds a simple way to generate csv output from "psql" queries,
 much simpler than playing around with COPY or \copy. It allows to
 generate
 a clean CSV dump from something as short as:

   sh> psql --csv -c 'TABLE foo' > foo.csv

 Documentation is clear.

 Test cover a significant number of cases (fieldsep, expanded,
 tuples-only).
 Although recordsep changes are not actually tested, it worked
 interactively
 and I think that tests are sufficient as is.

 There are somehow remaining point about which a committer/other people
 input would be nice:

 (1) There are some mild disagreement whether the fieldsep should be
 format specific shared with other format. I do not think that a specific
 fieldsep is worth it, but this is a marginal preference, and other people
 opinion differ. What is best is not obvious.

 Pavel also suggested to have a special handling based on whether the
 fieldsep is explicitely set or not. I'm not too keen on that because it
 departs significantly from the way psql formatting is currently handled,
 and what is happening becomes unclear to the user.

 (2) For interactive use, two commands are required: \pset format csv +
 \pset fieldsep ',' (or ';' or '\t' or whatever...). Maybe some \csv command
 similar to \H would be appropriate, or not, to set both values more
 efficiently. Could be something for another patch.

 Not sure what is the status of the patch if we do not have a clear
 consensus.

>>>
>>> I am sorry, but I don't think so this interface is good enough. Using |
>>> as
>>> default CSV separator is just wrong. It and only it is a problem. Any
>>> other
>>> is perfect.
>>>
>>
>> I do not think that there is a perfect solution, so some compromise will
>> be needed or we won't get it.
>>
>> (1) patch v4:
>>
>> "\pset format csv" retains the current fieldsep value, so fields are
>> separated by whatever is in the variable, which means that for getting
>> a standard csv two commands are needed, which is clearly documented,
>> but may be considered as surprising. ISTM that the underlying point is
>> that "format" is really about string escaping, not about the full
>> output
>> format, but this is a pre-existing situation.
>>
>> I'm suggesting to add \csv which would behave like \H to toggle CSV
>> mode so as to improve this situation, with a caveat which is that
>> toggling back \csv would have forgotted the previous settings (just
>> like \H does, though, so would for instance reset to aligned with |),
>> so it would not be perfect.
>>
>
> this doesn't solve usual format settings by \pset format csv
>
>
>>
>> (2) your proposal as I understand it:
>>
>> "\pset format csv" may or may not use the fieldsep, depending on
>> whether it was explicitely set, an information which is not shown,
>> i.e.:
>>
>>   \pset fieldsep # fieldsep separator is "|"
>>   \pset format csv # would output a,b,c or a|b|c...
>>
>> Because it depends on whether fieldsep was set explicitely to '|' or
>> whether it has this value but it was due to the default.
>>
>> This kind of unclear behavioral determinism does not seem desirable.
>>
>
> please, check and test attached patch. It is very simply for usage - and
> there is not any unclear behave. Just you should to accept so formats can
> have own defaults for separators.
>
>
>>
>> (3) other option, always use a comma:
>>
>> this was rejected because some people like their comma separated
>> values to be separated by semi-colons or tabs (aka tsv).
>>
>> (4) other option, Daniel v3 or v2:
>>
>> use a distinct "fieldsep_csv" variable initially set to ','. This adds
>> yet another specific variable that has to be remembered, some styles
>> would use fieldsep but csv would not so it is some kind of exception
>> that I would wish to avoid.
>>
>> My current preference order in the suggested solutions is 1, 4, 2, 3,
>> with a significant preference for 1.
>>
>
> I am thinking so @1 solves nothing - people are using \pset format ...
>
> @3 is clearly bad - there are not any discussion
>
> @4 can be compromise solution, but then there should be renamed fieldsep.
> Now, fieldsep is used just for unaligned format - for nothing else. If we
> introduce fieldsep_csv, then fieldsep should be renamed to
> fieldsep_unaligned. I can live with it.
>
> But I think so default fieldsep is better option. Please, try my patch and
> comment it.
>

minor fix

Regards

Pavel


>
> Regards
>
> Pavel
>
>>
>> --
>> Fabien.
>>
>
>
diff --git a/doc/src/sgml/ref/psql-ref.sgml b/doc/src/sgml/ref/psql-ref.sgml
index 10b97950ec..c984a9cbaa 100644
--- a/doc/src/sgml/ref/psql-ref.sgml
+++ 

Re: Re: csv format for psql

2018-03-24 Thread Pavel Stehule
2018-03-24 8:15 GMT+01:00 Fabien COELHO :

>
> Hello Pavel,
>
> The patch adds a simple way to generate csv output from "psql" queries,
>>> much simpler than playing around with COPY or \copy. It allows to
>>> generate
>>> a clean CSV dump from something as short as:
>>>
>>>   sh> psql --csv -c 'TABLE foo' > foo.csv
>>>
>>> Documentation is clear.
>>>
>>> Test cover a significant number of cases (fieldsep, expanded,
>>> tuples-only).
>>> Although recordsep changes are not actually tested, it worked
>>> interactively
>>> and I think that tests are sufficient as is.
>>>
>>> There are somehow remaining point about which a committer/other people
>>> input would be nice:
>>>
>>> (1) There are some mild disagreement whether the fieldsep should be
>>> format specific shared with other format. I do not think that a specific
>>> fieldsep is worth it, but this is a marginal preference, and other people
>>> opinion differ. What is best is not obvious.
>>>
>>> Pavel also suggested to have a special handling based on whether the
>>> fieldsep is explicitely set or not. I'm not too keen on that because it
>>> departs significantly from the way psql formatting is currently handled,
>>> and what is happening becomes unclear to the user.
>>>
>>> (2) For interactive use, two commands are required: \pset format csv +
>>> \pset fieldsep ',' (or ';' or '\t' or whatever...). Maybe some \csv command
>>> similar to \H would be appropriate, or not, to set both values more
>>> efficiently. Could be something for another patch.
>>>
>>> Not sure what is the status of the patch if we do not have a clear
>>> consensus.
>>>
>>
>> I am sorry, but I don't think so this interface is good enough. Using | as
>> default CSV separator is just wrong. It and only it is a problem. Any
>> other
>> is perfect.
>>
>
> I do not think that there is a perfect solution, so some compromise will
> be needed or we won't get it.
>
> (1) patch v4:
>
> "\pset format csv" retains the current fieldsep value, so fields are
> separated by whatever is in the variable, which means that for getting
> a standard csv two commands are needed, which is clearly documented,
> but may be considered as surprising. ISTM that the underlying point is
> that "format" is really about string escaping, not about the full
> output
> format, but this is a pre-existing situation.
>
> I'm suggesting to add \csv which would behave like \H to toggle CSV
> mode so as to improve this situation, with a caveat which is that
> toggling back \csv would have forgotted the previous settings (just
> like \H does, though, so would for instance reset to aligned with |),
> so it would not be perfect.
>

this doesn't solve usual format settings by \pset format csv


>
> (2) your proposal as I understand it:
>
> "\pset format csv" may or may not use the fieldsep, depending on
> whether it was explicitely set, an information which is not shown,
> i.e.:
>
>   \pset fieldsep # fieldsep separator is "|"
>   \pset format csv # would output a,b,c or a|b|c...
>
> Because it depends on whether fieldsep was set explicitely to '|' or
> whether it has this value but it was due to the default.
>
> This kind of unclear behavioral determinism does not seem desirable.
>

please, check and test attached patch. It is very simply for usage - and
there is not any unclear behave. Just you should to accept so formats can
have own defaults for separators.


>
> (3) other option, always use a comma:
>
> this was rejected because some people like their comma separated
> values to be separated by semi-colons or tabs (aka tsv).
>
> (4) other option, Daniel v3 or v2:
>
> use a distinct "fieldsep_csv" variable initially set to ','. This adds
> yet another specific variable that has to be remembered, some styles
> would use fieldsep but csv would not so it is some kind of exception
> that I would wish to avoid.
>
> My current preference order in the suggested solutions is 1, 4, 2, 3, with
> a significant preference for 1.
>

I am thinking so @1 solves nothing - people are using \pset format ...

@3 is clearly bad - there are not any discussion

@4 can be compromise solution, but then there should be renamed fieldsep.
Now, fieldsep is used just for unaligned format - for nothing else. If we
introduce fieldsep_csv, then fieldsep should be renamed to
fieldsep_unaligned. I can live with it.

But I think so default fieldsep is better option. Please, try my patch and
comment it.

Regards

Pavel

>
> --
> Fabien.
>
diff --git a/doc/src/sgml/ref/psql-ref.sgml b/doc/src/sgml/ref/psql-ref.sgml
index 10b97950ec..c984a9cbaa 100644
--- a/doc/src/sgml/ref/psql-ref.sgml
+++ b/doc/src/sgml/ref/psql-ref.sgml
@@ -151,6 +151,16 @@ EOF
   
 
 
+
+  --csv
+  
+  
+  Switches to csv output mode. This is equivalent to \pset format
+  csv followed by \pset fieldsep ','.
+  
+  
+
+
 
   

Re: Re: csv format for psql

2018-03-24 Thread Fabien COELHO


Hello Pavel,


The patch adds a simple way to generate csv output from "psql" queries,
much simpler than playing around with COPY or \copy. It allows to generate
a clean CSV dump from something as short as:

  sh> psql --csv -c 'TABLE foo' > foo.csv

Documentation is clear.

Test cover a significant number of cases (fieldsep, expanded, tuples-only).
Although recordsep changes are not actually tested, it worked interactively
and I think that tests are sufficient as is.

There are somehow remaining point about which a committer/other people 
input would be nice:


(1) There are some mild disagreement whether the fieldsep should be 
format specific shared with other format. I do not think that a 
specific fieldsep is worth it, but this is a marginal preference, and 
other people opinion differ. What is best is not obvious.


Pavel also suggested to have a special handling based on whether the 
fieldsep is explicitely set or not. I'm not too keen on that because it 
departs significantly from the way psql formatting is currently 
handled, and what is happening becomes unclear to the user.


(2) For interactive use, two commands are required: \pset format csv + 
\pset fieldsep ',' (or ';' or '\t' or whatever...). Maybe some \csv 
command similar to \H would be appropriate, or not, to set both values 
more efficiently. Could be something for another patch.


Not sure what is the status of the patch if we do not have a clear
consensus.


I am sorry, but I don't think so this interface is good enough. Using | as
default CSV separator is just wrong. It and only it is a problem. Any other
is perfect.


I do not think that there is a perfect solution, so some compromise will 
be needed or we won't get it.


(1) patch v4:

"\pset format csv" retains the current fieldsep value, so fields are
separated by whatever is in the variable, which means that for getting
a standard csv two commands are needed, which is clearly documented,
but may be considered as surprising. ISTM that the underlying point is
that "format" is really about string escaping, not about the full output
format, but this is a pre-existing situation.

I'm suggesting to add \csv which would behave like \H to toggle CSV
mode so as to improve this situation, with a caveat which is that
toggling back \csv would have forgotted the previous settings (just
like \H does, though, so would for instance reset to aligned with |),
so it would not be perfect.

(2) your proposal as I understand it:

"\pset format csv" may or may not use the fieldsep, depending on
whether it was explicitely set, an information which is not shown, i.e.:

  \pset fieldsep # fieldsep separator is "|"
  \pset format csv # would output a,b,c or a|b|c...

Because it depends on whether fieldsep was set explicitely to '|' or
whether it has this value but it was due to the default.

This kind of unclear behavioral determinism does not seem desirable.

(3) other option, always use a comma:

this was rejected because some people like their comma separated
values to be separated by semi-colons or tabs (aka tsv).

(4) other option, Daniel v3 or v2:

use a distinct "fieldsep_csv" variable initially set to ','. This adds
yet another specific variable that has to be remembered, some styles
would use fieldsep but csv would not so it is some kind of exception
that I would wish to avoid.

My current preference order in the suggested solutions is 1, 4, 2, 3, with 
a significant preference for 1.


--
Fabien.



Re: Re: csv format for psql

2018-03-23 Thread Pavel Stehule
2018-03-23 18:55 GMT+01:00 Fabien COELHO :

>
> Hello Daniel,
>
> Do you know when you'll have an updated patch that addresses the minor
>>> issues brought up in review and the concern above?
>>>
>>
>> Here's an update addressing the issues discussed:
>>
>> - fieldsep and recordsep are used, no more fieldsep_csv
>> - the command line option is --csv without short option and is equivalent
>> to -P format=csv -P fieldsep=','
>> - pset output formats are reordered alphabetically on display
>> - a couple more cases in the regression tests
>>
>
> Patch applies cleanly, compiles, doc gen ok, "make check" ok.
>
> The patch adds a simple way to generate csv output from "psql" queries,
> much simpler than playing around with COPY or \copy. It allows to generate
> a clean CSV dump from something as short as:
>
>   sh> psql --csv -c 'TABLE foo' > foo.csv
>
> Documentation is clear.
>
> Test cover a significant number of cases (fieldsep, expanded, tuples-only).
> Although recordsep changes are not actually tested, it worked interactively
> and I think that tests are sufficient as is.
>
> There are somehow remaining point about which a committer/other people
> input
> would be nice:
>
> (1) There are some mild disagreement whether the fieldsep should be format
> specific shared with other format. I do not think that a specific
> fieldsep
> is worth it, but this is a marginal preference, and other people
> opinion
> differ. What is best is not obvious.
>
> Pavel also suggested to have a special handling based on whether
> the fieldsep is explicitely set or not. I'm not too keen on that
> because
> it departs significantly from the way psql formatting is currently
> handled, and what is happening becomes unclear to the user.
>
> (2) For interactive use, two commands are required: \pset format csv +
> \pset fieldsep ',' (or ';' or '\t' or whatever...). Maybe some \csv
> command similar  to \H would be appropriate, or not, to set both values
> more efficiently. Could be something for another patch.
>
> Not sure what is the status of the patch if we do not have a clear
> consensus.
>

I am sorry, but I don't think so this interface is good enough. Using | as
default CSV separator is just wrong. It and only it is a problem. Any other
is perfect.

Regards

Pavel


>
> --
> Fabien.
>
>


Re: Re: csv format for psql

2018-03-23 Thread Pavel Stehule
2018-03-23 12:59 GMT+01:00 Daniel Verite :

> Pavel Stehule wrote:
>
> > It should not be hard. All formats can has '|' like now, and csv can
> have a
> > ',' - then if field separator is not explicit, then default field
> separator
> > is used, else specified field separator is used.
> >
> > You can see my idea in attached patch
>
> With that patch, consider this sequence:
>
> postgres=# \pset format csv
> Output format is csv.
> postgres=# \pset fieldsep
> Field separator is "|".
> postgres=# select 1 as a,2 as b;
> a,b
> 1,2
>
> Here psql reports that fieldsep is "|" and right away is using something
> else in the output. That doesn't look good.
>

yes - my patch was proof concept - nothing more.

But this can be simply solved - if we have a table of default field
separator, then if separator is not explicit, then default for used format
is printed.

>
> You may object that it's fixable by tweaking the output of \pset,
> \pset fieldsep, and \? variables so that it knows that the current
> output format is going to use a "hidden" default separator, and
> then these commands should display that value instead.
> But that'd be somewhat playing whack-a-mole, as the following
> sequence would now be possible, with '|' being used  as
> the separator instead of the ',' reported just above:
>
> postgres=# \pset format csv
> Output format is csv.
> postgres=# \pset fieldsep
> Field separator is ",".
> postgres=# \a
> Output format is aligned.
> postgres=# select 1 as a,2 as b;
>  a | b
> ---+---
>  1 | 2
>

I am sorry, but path that I sent was just proof concept - I didn't
implement defaults for any other related formats.

I'll try to send cleaner patch tomorrow.

Regards

Pavel


>
>
> Best regards,
> --
> Daniel Vérité
> PostgreSQL-powered mailer: http://www.manitou-mail.org
> Twitter: @DanielVerite
>


Re: Re: csv format for psql

2018-03-23 Thread Fabien COELHO


Hello Daniel,


Do you know when you'll have an updated patch that addresses the minor
issues brought up in review and the concern above?


Here's an update addressing the issues discussed:

- fieldsep and recordsep are used, no more fieldsep_csv
- the command line option is --csv without short option and is equivalent
to -P format=csv -P fieldsep=','
- pset output formats are reordered alphabetically on display
- a couple more cases in the regression tests


Patch applies cleanly, compiles, doc gen ok, "make check" ok.

The patch adds a simple way to generate csv output from "psql" queries, 
much simpler than playing around with COPY or \copy. It allows to generate 
a clean CSV dump from something as short as:


  sh> psql --csv -c 'TABLE foo' > foo.csv

Documentation is clear.

Test cover a significant number of cases (fieldsep, expanded, tuples-only).
Although recordsep changes are not actually tested, it worked interactively
and I think that tests are sufficient as is.

There are somehow remaining point about which a committer/other people input
would be nice:

(1) There are some mild disagreement whether the fieldsep should be format
specific shared with other format. I do not think that a specific fieldsep
is worth it, but this is a marginal preference, and other people opinion
differ. What is best is not obvious.

Pavel also suggested to have a special handling based on whether
the fieldsep is explicitely set or not. I'm not too keen on that because
it departs significantly from the way psql formatting is currently
handled, and what is happening becomes unclear to the user.

(2) For interactive use, two commands are required: \pset format csv +
\pset fieldsep ',' (or ';' or '\t' or whatever...). Maybe some \csv
command similar  to \H would be appropriate, or not, to set both values
more efficiently. Could be something for another patch.

Not sure what is the status of the patch if we do not have a clear 
consensus.


--
Fabien.



Re: Re: csv format for psql

2018-03-23 Thread Daniel Verite
Pavel Stehule wrote:

> It should not be hard. All formats can has '|' like now, and csv can have a
> ',' - then if field separator is not explicit, then default field separator
> is used, else specified field separator is used.
> 
> You can see my idea in attached patch

With that patch, consider this sequence:

postgres=# \pset format csv
Output format is csv.
postgres=# \pset fieldsep
Field separator is "|".
postgres=# select 1 as a,2 as b;
a,b
1,2

Here psql reports that fieldsep is "|" and right away is using something
else in the output. That doesn't look good.

You may object that it's fixable by tweaking the output of \pset,
\pset fieldsep, and \? variables so that it knows that the current
output format is going to use a "hidden" default separator, and
then these commands should display that value instead.
But that'd be somewhat playing whack-a-mole, as the following
sequence would now be possible, with '|' being used  as
the separator instead of the ',' reported just above:

postgres=# \pset format csv
Output format is csv.
postgres=# \pset fieldsep
Field separator is ",".
postgres=# \a
Output format is aligned.
postgres=# select 1 as a,2 as b;
 a | b 
---+---
 1 | 2


Best regards,
-- 
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite



Re: Re: csv format for psql

2018-03-22 Thread Pavel Stehule
2018-03-22 20:10 GMT+01:00 Pavel Stehule :

>
>
> 2018-03-22 19:28 GMT+01:00 Pavel Stehule :
>
>>
>>
>> 2018-03-22 18:38 GMT+01:00 Fabien COELHO :
>>
>>>
>>> Hello Pavel,
>>>
>>> Using \pset format csv means overwriting field sep every time - nobody
 uses
 |

>>>
>>> Yep. The alternative is to have a csv-specific separator variable, which
>>> does not seem very useful, must be remembered, but this is indeed debatable.
>>>
>>> I think so dependency on order of psql arguments is significant problem

>>>
>>> This is intentional, and this issue/feature already exists, the last
>>> argument overwrite previous settings thus will win, eg:
>>>
>>>   psql --pset=format=troff --html -c 'SELECT 1'
>>>
>>> Will output in html, not in troff.
>>>
>>
>> Can we introduce some format specific default separators - if we would
>> not to introduce csv_field_sep options?
>>
>> It should not be hard. All formats can has '|' like now, and csv can have
>> a ',' - then if field separator is not explicit, then default field
>> separator is used, else specified field separator is used.
>>
>> You can see my idea in attached patch
>>
>> Regards
>>
>> Pavel
>>
>> postgres=# \pset format csv
>> Output format is csv.
>> postgres=# select * from foo;
>> a,b,c
>> 1,2,Hello
>> 3,4,Nazdar
>> postgres=# \pset fieldsep ;
>> Field separator is ";".
>> postgres=# select * from foo;
>> a;b;c
>> 1;2;Hello
>> 3;4;Nazdar
>>
>>
>>
> The default fieldsep should be "off" that means so format defaults are
> used. ',' is used for CSV, | for any else.
>
> So all will work like now, but there will be bigger freedom with new
> format design. Now, all formats with possibility to setting fieldsep,
> should to share default '|', what I think, is not practical.
>

I can make patch, if there will be a agreement.

comments?

Regards

Pavel

>
>
>
>
>
>>
>>
>>
>>
>>>
>>> --
>>> Fabien.
>>>
>>
>>
>


Re: Re: csv format for psql

2018-03-22 Thread Pavel Stehule
2018-03-22 19:28 GMT+01:00 Pavel Stehule :

>
>
> 2018-03-22 18:38 GMT+01:00 Fabien COELHO :
>
>>
>> Hello Pavel,
>>
>> Using \pset format csv means overwriting field sep every time - nobody
>>> uses
>>> |
>>>
>>
>> Yep. The alternative is to have a csv-specific separator variable, which
>> does not seem very useful, must be remembered, but this is indeed debatable.
>>
>> I think so dependency on order of psql arguments is significant problem
>>>
>>
>> This is intentional, and this issue/feature already exists, the last
>> argument overwrite previous settings thus will win, eg:
>>
>>   psql --pset=format=troff --html -c 'SELECT 1'
>>
>> Will output in html, not in troff.
>>
>
> Can we introduce some format specific default separators - if we would not
> to introduce csv_field_sep options?
>
> It should not be hard. All formats can has '|' like now, and csv can have
> a ',' - then if field separator is not explicit, then default field
> separator is used, else specified field separator is used.
>
> You can see my idea in attached patch
>
> Regards
>
> Pavel
>
> postgres=# \pset format csv
> Output format is csv.
> postgres=# select * from foo;
> a,b,c
> 1,2,Hello
> 3,4,Nazdar
> postgres=# \pset fieldsep ;
> Field separator is ";".
> postgres=# select * from foo;
> a;b;c
> 1;2;Hello
> 3;4;Nazdar
>
>
>
The default fieldsep should be "off" that means so format defaults are
used. ',' is used for CSV, | for any else.

So all will work like now, but there will be bigger freedom with new format
design. Now, all formats with possibility to setting fieldsep, should to
share default '|', what I think, is not practical.





>
>
>
>
>>
>> --
>> Fabien.
>>
>
>


Re: Re: csv format for psql

2018-03-22 Thread Pavel Stehule
2018-03-22 18:38 GMT+01:00 Fabien COELHO :

>
> Hello Pavel,
>
> Using \pset format csv means overwriting field sep every time - nobody uses
>> |
>>
>
> Yep. The alternative is to have a csv-specific separator variable, which
> does not seem very useful, must be remembered, but this is indeed debatable.
>
> I think so dependency on order of psql arguments is significant problem
>>
>
> This is intentional, and this issue/feature already exists, the last
> argument overwrite previous settings thus will win, eg:
>
>   psql --pset=format=troff --html -c 'SELECT 1'
>
> Will output in html, not in troff.
>

Can we introduce some format specific default separators - if we would not
to introduce csv_field_sep options?

It should not be hard. All formats can has '|' like now, and csv can have a
',' - then if field separator is not explicit, then default field separator
is used, else specified field separator is used.

You can see my idea in attached patch

Regards

Pavel

postgres=# \pset format csv
Output format is csv.
postgres=# select * from foo;
a,b,c
1,2,Hello
3,4,Nazdar
postgres=# \pset fieldsep ;
Field separator is ";".
postgres=# select * from foo;
a;b;c
1;2;Hello
3;4;Nazdar






>
> --
> Fabien.
>
diff --git a/doc/src/sgml/ref/psql-ref.sgml b/doc/src/sgml/ref/psql-ref.sgml
index bfdf859731..4d3e3b59f3 100644
--- a/doc/src/sgml/ref/psql-ref.sgml
+++ b/doc/src/sgml/ref/psql-ref.sgml
@@ -151,6 +151,16 @@ EOF
   
 
 
+
+  --csv
+  
+  
+  Switches to csv output mode. This is equivalent to \pset format
+  csv followed by \pset fieldsep ','.
+  
+  
+
+
 
   -d dbname
   --dbname=dbname
@@ -246,7 +256,7 @@ EOF
   
   
   Use separator as the
-  field separator for unaligned output. This is equivalent to
+  field separator for unaligned and csv outputs. This is equivalent to
   \pset fieldsep or \f.
   
   
@@ -382,7 +392,7 @@ EOF
   
   
   Use separator as the
-  record separator for unaligned output. This is equivalent to
+  record separator for unaligned and csv outputs. This is equivalent to
   \pset recordsep.
   
   
@@ -558,7 +568,7 @@ EOF
   
   
   Set the field separator for unaligned output to a zero byte.  This is
-  equvalent to \pset fieldsep_zero.
+  equivalent to \pset fieldsep_zero.
   
   
 
@@ -1937,9 +1947,9 @@ Tue Oct 26 21:40:57 CEST 1999
 
 
 
-Sets the field separator for unaligned query output. The default
-is the vertical bar (|). It is equivalent to
-\pset fieldsep.
+Sets the field separator for unaligned and csv query outputs. The
+default is the vertical bar (|). It is equivalent
+to \pset fieldsep.
 
 
   
@@ -2546,8 +2556,8 @@ lo_import 152801
   fieldsep
   
   
-  Specifies the field separator to be used in unaligned output
-  format. That way one can create, for example, tab- or
+  Specifies the field separator to be used in unaligned and csv output
+  formats. That way one can create, for example, tab- or
   comma-separated output, which other programs might prefer. To
   set a tab as field separator, type \pset fieldsep
   '\t'. The default field separator is
@@ -2584,9 +2594,13 @@ lo_import 152801
   format
   
   
-  Sets the output format to one of unaligned,
-  aligned, wrapped,
-  html, asciidoc,
+  Sets the output format to one of
+  unaligned,
+  aligned,
+  csv,
+  wrapped,
+  html,
+  asciidoc,
   latex (uses tabular),
   latex-longtable, or
   troff-ms.
@@ -2601,6 +2615,15 @@ lo_import 152801
   format).
   
 
+  csv format writes columns separated
+  by fieldsep, applying the CSV quoting rules
+  described in RFC-4180 and compatible with the CSV format
+  of the COPY command.
+  The header with column names is output unless the
+  tuples_only parameter is on.
+  Title and footers are not printed.
+  
+
   aligned format is the standard, human-readable,
   nicely formatted text output;  this is the default.
   
@@ -2747,8 +2770,8 @@ lo_import 152801
   recordsep
   
   
-  Specifies the record (line) separator to use in unaligned
-  output format. The default is a newline character.
+  Specifies the record (line) separator to use in unaligned or
+  csv output formats. The default is a newline character.
   
   
   
diff --git a/src/bin/psql/command.c b/src/bin/psql/command.c
index 3560318749..1cd8a3856e 100644
--- a/src/bin/psql/command.c
+++ b/src/bin/psql/command.c
@@ -1222,6 +1222,10 @@ exec_command_f(PsqlScanState 

Re: Re: csv format for psql

2018-03-22 Thread Fabien COELHO


Hello Pavel,


Using \pset format csv means overwriting field sep every time - nobody uses
|


Yep. The alternative is to have a csv-specific separator variable, which 
does not seem very useful, must be remembered, but this is indeed 
debatable.



I think so dependency on order of psql arguments is significant problem


This is intentional, and this issue/feature already exists, the last 
argument overwrite previous settings thus will win, eg:


  psql --pset=format=troff --html -c 'SELECT 1'

Will output in html, not in troff.

--
Fabien.



Re: Re: csv format for psql

2018-03-22 Thread Pavel Stehule
Hi

2018-03-22 16:30 GMT+01:00 Daniel Verite :

> David Steele wrote:
>
> > Do you know when you'll have an updated patch that addresses the minor
> > issues brought up in review and the concern above?
>
> Here's an update addressing the issues discussed:
>
> - fieldsep and recordsep are used, no more fieldsep_csv
> - the command line option is --csv without short option and is equivalent
>  to -P format=csv -P fieldsep=','
> - pset output formats are reordered alphabetically on display
> - a couple more cases in the regression tests
>
>
I am sorry, I don't think so this design is correct. It introduce
dependency on arguments order

# correct output
[pavel@nemesis postgresql]$ psql --csv -c "table foo" -F ';'
a;b;c
1;2;Hello
3;4;Nazdar

# error -F is ignored
[pavel@nemesis postgresql]$ psql -F ';' --csv -c "table foo"
a,b,c
1,2,Hello
3,4,Nazdar

Using \pset format csv means overwriting field sep every time - nobody uses
|

I think so dependency on order of psql arguments is significant problem

Regards

Pavel





>
> Best regards,
> --
> Daniel Vérité
> PostgreSQL-powered mailer: http://www.manitou-mail.org
> Twitter: @DanielVerite
>


Re: Re: csv format for psql

2018-03-22 Thread Daniel Verite
David Steele wrote:

> Do you know when you'll have an updated patch that addresses the minor
> issues brought up in review and the concern above?

Here's an update addressing the issues discussed:

- fieldsep and recordsep are used, no more fieldsep_csv
- the command line option is --csv without short option and is equivalent
 to -P format=csv -P fieldsep=','
- pset output formats are reordered alphabetically on display
- a couple more cases in the regression tests


Best regards,
-- 
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite
diff --git a/doc/src/sgml/ref/psql-ref.sgml b/doc/src/sgml/ref/psql-ref.sgml
index 10b9795..c984a9c 100644
--- a/doc/src/sgml/ref/psql-ref.sgml
+++ b/doc/src/sgml/ref/psql-ref.sgml
@@ -152,6 +152,16 @@ EOF
 
 
 
+  --csv
+  
+  
+  Switches to csv output mode. This is equivalent to \pset format
+  csv followed by \pset fieldsep ','.
+  
+  
+
+
+
   -d dbname
   --dbname=dbname
   
@@ -246,7 +256,7 @@ EOF
   
   
   Use separator as the
-  field separator for unaligned output. This is equivalent to
+  field separator for unaligned and csv outputs. This is equivalent to
   \pset fieldsep or \f.
   
   
@@ -382,7 +392,7 @@ EOF
   
   
   Use separator as the
-  record separator for unaligned output. This is equivalent to
+  record separator for unaligned and csv outputs. This is equivalent to
   \pset recordsep.
   
   
@@ -558,7 +568,7 @@ EOF
   
   
   Set the field separator for unaligned output to a zero byte.  This is
-  equvalent to \pset fieldsep_zero.
+  equivalent to \pset fieldsep_zero.
   
   
 
@@ -1937,9 +1947,9 @@ Tue Oct 26 21:40:57 CEST 1999
 
 
 
-Sets the field separator for unaligned query output. The default
-is the vertical bar (|). It is equivalent to
-\pset fieldsep.
+Sets the field separator for unaligned and csv query outputs. The
+default is the vertical bar (|). It is equivalent
+to \pset fieldsep.
 
 
   
@@ -2546,8 +2556,8 @@ lo_import 152801
   fieldsep
   
   
-  Specifies the field separator to be used in unaligned output
-  format. That way one can create, for example, tab- or
+  Specifies the field separator to be used in unaligned and csv output
+  formats. That way one can create, for example, tab- or
   comma-separated output, which other programs might prefer. To
   set a tab as field separator, type \pset fieldsep
   '\t'. The default field separator is
@@ -2584,9 +2594,13 @@ lo_import 152801
   format
   
   
-  Sets the output format to one of unaligned,
-  aligned, wrapped,
-  html, asciidoc,
+  Sets the output format to one of
+  unaligned,
+  aligned,
+  csv,
+  wrapped,
+  html,
+  asciidoc,
   latex (uses tabular),
   latex-longtable, or
   troff-ms.
@@ -2601,6 +2615,15 @@ lo_import 152801
   format).
   
 
+  csv format writes columns separated
+  by fieldsep, applying the CSV quoting rules
+  described in RFC-4180 and compatible with the CSV format
+  of the COPY command.
+  The header with column names is output unless the
+  tuples_only parameter is on.
+  Title and footers are not printed.
+  
+
   aligned format is the standard, 
human-readable,
   nicely formatted text output;  this is the default.
   
@@ -2747,8 +2770,8 @@ lo_import 152801
   recordsep
   
   
-  Specifies the record (line) separator to use in unaligned
-  output format. The default is a newline character.
+  Specifies the record (line) separator to use in unaligned or
+  csv output formats. The default is a newline character.
   
   
   
diff --git a/src/bin/psql/command.c b/src/bin/psql/command.c
index 3560318..1d8cc96 100644
--- a/src/bin/psql/command.c
+++ b/src/bin/psql/command.c
@@ -3603,6 +3603,9 @@ _align2string(enum printFormat in)
case PRINT_TROFF_MS:
return "troff-ms";
break;
+   case PRINT_CSV:
+   return "csv";
+   break;
}
return "unknown";
 }
@@ -3658,25 +3661,27 @@ do_pset(const char *param, const char *value, 
printQueryOpt *popt, bool quiet)
{
if (!value)
;
-   else if (pg_strncasecmp("unaligned", value, vallen) == 0)
-   popt->topt.format = PRINT_UNALIGNED;
else if (pg_strncasecmp("aligned", value, vallen) == 0)

Re: Re: csv format for psql

2018-03-21 Thread David Steele
Hi Daniel,

On 3/10/18 1:49 AM, Fabien COELHO wrote:
> 
>>> recordsep in the unaligned mode doesn't play the role of a line ending
>>> because the last line is not finished by recordsep. According to the
>>> source
>>> code, this is intended, see print_unaligned_text() in print.c:
>>
>> Something else comes to mind: CSV allows linefeeds inside fields, and
>> we don't want to replace these with record separators.
> 
> Sure.
> 
>> So the notion that recordsep can be used to choose line endings
>> is even less okay than if there was just the last line issue.
> 
> I'm not following. ISTM that the escaping macanism would work in pretty
> all reasonable cases, although it might be possible to shot oneself in
> the foot by setting manually strange values for recordsep, eg '"'. I do
> not see that as a significant issue. If the user asks for something
> stupid, they get something stupid, fine.

This patch looks like it is getting close but there has been no update
in a while.

Do you know when you'll have an updated patch that addresses the minor
issues brought up in review and the concern above?

Regards,
-- 
-David
da...@pgmasters.net



Re: csv format for psql

2018-03-09 Thread Fabien COELHO



recordsep in the unaligned mode doesn't play the role of a line ending
because the last line is not finished by recordsep. According to the source
code, this is intended, see print_unaligned_text() in print.c:


Something else comes to mind: CSV allows linefeeds inside fields, and
we don't want to replace these with record separators.


Sure.


So the notion that recordsep can be used to choose line endings
is even less okay than if there was just the last line issue.


I'm not following. ISTM that the escaping macanism would work in pretty 
all reasonable cases, although it might be possible to shot oneself in the 
foot by setting manually strange values for recordsep, eg '"'. I do not 
see that as a significant issue. If the user asks for something stupid, 
they get something stupid, fine.


--
Fabien.



Re: csv format for psql

2018-03-09 Thread Fabien COELHO


Hello Daniel,

About "-C", I'm fine if it is used and if it is not used, really. psql 
begins to be quite full of one letter options, currently 34 of them, with 
upper & lower cases and numbers included.




The solution to set fieldsep automatically to ',' with
\pset format csv is problematic.


I agree. I'm really advocating that --csv would set fieldsep, but manual 
pset on format would still do what is expected, and only that, i.e. --csv 
is NOT a simple shortcut for  -P format=csv".



Same problem on the command line. Options are evaluated left-to-right:

$ psql --csv -F';'
would work as expected, but
$ psql -F';' --csv
would not.


ISTM that having an option overriding another one after it is standard 
practice.


I would be fine with that if --csv is documented as "setting format, 
fieldsep and recordsep to default suited for outputting CSV".


Now this is just a personal opinion.


The "\n" eol style is hardcoded. Should it use "recordsep"? For instance,
https://tools.ietf.org/html/rfc4180 seems to specify CRLF end of lines.
The definition is evolving, eg https://www.w3.org/TR/tabular-data-model/
accepts both "\r" and "\r\n". I do not like using windows eol, but I think
that it should be possible to do it, which is not the case with this
version.


Interesting point. The output stream is opened in text mode so printing
'\n' should generate LF on Unix, CR LF on Windows, and I think CR on MacOS.
I think that's for the best.


I did not know that C's putc/printf/... would change output on sight on 
different systems. I'm not sure I like it. It would still mean that one 
cannot change the format to suits \r\n or \n at will, which is kind of 
disappointing.



recordsep in the unaligned mode doesn't play the role of a line ending
because the last line is not finished by recordsep.


It could just be with CSV format? As you point out, there is already an 
exception with the separator is '\0'. Note that the last line of a CSV 
file may or may not end with \n or \r\n, according to specs.



I'd suggest that tests should include more types, not just strings. I
would suggest int, float, timestamp, bytea, an array (which uses , as a
separator), json (which uses both " and ,)...


I'll do but the printout code is type-agnostic so it's not supposed
to make a difference compared to mere literals.


Sure, but it seems better to actually see that it works properly for non 
trivial cases.



Cases with NULLs are missing though, I'll go add some too.


Indeed.

--
Fabien.



Re: csv format for psql

2018-03-09 Thread David G. Johnston
On Fri, Mar 9, 2018 at 3:18 PM, Daniel Verite 
wrote:

> I think that the point of recordsep in unaligned mode is you can set it
>
to something that never appears in the data, especially when embedded
> newlines might be in the data. In CSV this is solved differently so
> we don't need it.


​I'd rather argue it from the standpoint that \copy doesn't use recordsep
nor fieldsep and thus neither should --csv; which is arguably a convenience
invocation of \copy that pipes to psql's stdout (and overcomes \copy's
single-line limitation - which I think still exists... - and inability to
use variables - does it?...).  COPY doesn't allow for changing the record
separator and the newline output is system-dependent.  I can accept the
same limitation with this feature.

I suppose the question is how many "COPY" options do we want to expose on
the command line, and how does it look?

I'll put a -1 on having a short option (-C or otherwise); "that is the way
its always been done" doesn't work for me here - by way of example "-a and
-A" is ill-advised; --echo-all does not seem important enough to warrant a
short option (especially not a lower-case one) and so the more useful
unaligned mode is forced into the secondary capital A position.

David J.


  1   2   >