Re: [HACKERS] psql NUL record and field separator

2012-02-08 Thread Abhijit Menon-Sen
At 2012-02-07 13:20:43 +0200, pete...@gmx.net wrote:

 Should we rename the options and/or add that to the documentation, or is
 the new behavior obvious and any new terminology would be too confusing?

I agree there is potential for confusion either way. I tried to come up
with a complete and not-confusing wording for all four options, but did
not manage to improve on the current state of affairs significantly. I
think it can stay the way it is. The reference to xargs -0 is probably
enough to set the right expectations about how it works.

We can always add a sentence later to clarify the special-case behaviour
of -0 if anyone is actually confused (and the best wording will be more
clear in that situation too).

-- Abhijit

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] psql NUL record and field separator

2012-02-07 Thread Peter Eisentraut
On tor, 2012-01-26 at 19:00 +0530, Abhijit Menon-Sen wrote:
 At issue are (at least) these three lines from print_unaligned_text in
 src/bin/psql/print.c:
 
  358 /* the last record needs to be concluded with a newline
 */
  359 if (need_recordsep)
  360 fputc('\n', fout);
 
 Perhaps the right thing to do would be to change this to output \0 if
 --record-separator-zero was used (but leave it at \n otherwise)? That
 is what my second attached patch does:
 
 $ bin/psql --record-separator-zero --field-separator-zero -At -c
 'select 1,2 union select 3,4'|xargs -0 echo
 1 2 3 4
 
 Thoughts?
 
  I think the most common use of this would be to set the record
  separator from the command line, so we could use a short option
  such as -0 or -z for that.
 
 I agree. The current option names are very unwieldy to type.
 
I have incorporated your two patches and added short options.  Updated
patch attached.

This made me wonder, however.  The existing -F and -R options set the
record *separator*.  The new options, however, set the record
*terminator*.  This is the small distinction that you had discovered.

Should we rename the options and/or add that to the documentation, or is
the new behavior obvious and any new terminology would be too confusing?
diff --git a/doc/src/sgml/ref/psql-ref.sgml b/doc/src/sgml/ref/psql-ref.sgml
index a9b1ed2..55aa5f2 100644
--- a/doc/src/sgml/ref/psql-ref.sgml
+++ b/doc/src/sgml/ref/psql-ref.sgml
@@ -482,6 +482,27 @@ PostgreSQL documentation
   /listitem
 /varlistentry
 
+varlistentry
+  termoption-z/option/term
+  termoption--field-separator-zero/option/term
+  listitem
+  para
+  Set the field separator for unaligned output to a zero byte.
+  /para
+  /listitem
+/varlistentry
+
+varlistentry
+  termoption-0/option/term
+  termoption--record-separator-zero/option/term
+  listitem
+  para
+  Set the record separator for unaligned output to a zero byte.  This is
+  useful for interfacing, for example, with literalxargs -0/literal.
+  /para
+  /listitem
+/varlistentry
+
  varlistentry
   termoption-1/option/term
   termoption--single-transaction/option/term
@@ -1909,6 +1930,16 @@ lo_import 152801
   /varlistentry
 
   varlistentry
+  termliteralfieldsep_zero/literal/term
+  listitem
+  para
+  Sets the field separator to use in unaligned output format to a zero
+  byte.
+  /para
+  /listitem
+  /varlistentry
+
+  varlistentry
   termliteralfooter/literal/term
   listitem
   para
@@ -2078,6 +2109,16 @@ lo_import 152801
   /varlistentry
 
   varlistentry
+  termliteralrecordsep_zero/literal/term
+  listitem
+  para
+  Sets the record separator to use in unaligned output format to a zero
+  byte.
+  /para
+  /listitem
+  /varlistentry
+
+  varlistentry
   termliteraltableattr/literal (or literalT/literal)/term
   listitem
   para
diff --git a/src/bin/psql/command.c b/src/bin/psql/command.c
index ab809ec..8421ad0 100644
--- a/src/bin/psql/command.c
+++ b/src/bin/psql/command.c
@@ -2272,11 +2272,26 @@ do_pset(const char *param, const char *value, printQueryOpt *popt, bool quiet)
 	{
 		if (value)
 		{
-			free(popt-topt.fieldSep);
-			popt-topt.fieldSep = pg_strdup(value);
+			free(popt-topt.fieldSep.separator);
+			popt-topt.fieldSep.separator = pg_strdup(value);
+			popt-topt.fieldSep.separator_zero = false;
 		}
 		if (!quiet)
-			printf(_(Field separator is \%s\.\n), popt-topt.fieldSep);
+		{
+			if (popt-topt.fieldSep.separator_zero)
+printf(_(Field separator is zero byte.\n));
+			else
+printf(_(Field separator is \%s\.\n), popt-topt.fieldSep.separator);
+		}
+	}
+
+	else if (strcmp(param, fieldsep_zero) == 0)
+	{
+		free(popt-topt.fieldSep.separator);
+		popt-topt.fieldSep.separator = NULL;
+		popt-topt.fieldSep.separator_zero = true;
+		if (!quiet)
+			printf(_(Field separator is zero byte.\n));
 	}
 
 	/* record separator for unaligned text */
@@ -2284,18 +2299,30 @@ do_pset(const char *param, const char *value, printQueryOpt *popt, bool quiet)
 	{
 		if (value)
 		{
-			free(popt-topt.recordSep);
-			popt-topt.recordSep = pg_strdup(value);
+			free(popt-topt.recordSep.separator);
+			popt-topt.recordSep.separator = pg_strdup(value);
+			popt-topt.recordSep.separator_zero = false;
 		}
 		if (!quiet)
 		{
-			if (strcmp(popt-topt.recordSep, \n) == 0)
+			if (popt-topt.recordSep.separator_zero)
+printf(_(Record separator is zero byte.\n));
+			else if (strcmp(popt-topt.recordSep.separator, \n) == 0)
 printf(_(Record separator is newline.));
 			else
-printf(_(Record separator is \%s\.\n), popt-topt.recordSep);
+printf(_(Record separator is \%s\.\n), popt-topt.recordSep.separator);
 		}
 	}
 
+	else if (strcmp(param, 

Re: [HACKERS] psql NUL record and field separator

2012-01-26 Thread Abhijit Menon-Sen
At 2012-01-14 14:23:49 +0200, pete...@gmx.net wrote:

 Inspired by this question http://stackoverflow.com/questions/6857265 I
 have implemented a way to set the psql record and field separators to
 a zero byte (ASCII NUL character).

Since this patch is in the commitfest, I had a look at it.

I agree that the feature is useful. The patch applies and builds cleanly
with HEAD@9f9135d1, but needs a further minor tweak to work (attached).
Without it, both zero separators get overwritten with the default value
after option parsing. The code looks good otherwise.

There's one problem:

 psql --record-separator-zero -At -c 'select something from somewhere' | xargs 
 -0 dosomething

If you run find -print0 and it finds one file, it will still print
filename\0, and xargs -0 will work fine.

But psql --record-separator-zero -At -c 'select 1' will print 1\n, not
1\0 or even 1\0\n, so xargs -0 will use the value 1\n, not 1. If
you're doing this in a shell script, handing the last argument specially
would be painful.

At issue are (at least) these three lines from print_unaligned_text in
src/bin/psql/print.c:

 358 /* the last record needs to be concluded with a newline */
 359 if (need_recordsep)
 360 fputc('\n', fout);

Perhaps the right thing to do would be to change this to output \0 if
--record-separator-zero was used (but leave it at \n otherwise)? That
is what my second attached patch does:

$ bin/psql --record-separator-zero --field-separator-zero -At -c 'select 1,2 
union select 3,4'|xargs -0 echo
1 2 3 4

Thoughts?

 I think the most common use of this would be to set the record
 separator from the command line, so we could use a short option
 such as -0 or -z for that.

I agree. The current option names are very unwieldy to type.

-- ams
diff --git a/src/bin/psql/startup.c b/src/bin/psql/startup.c
index 69207c1..691ad14 100644
--- a/src/bin/psql/startup.c
+++ b/src/bin/psql/startup.c
@@ -150,12 +150,14 @@ main(int argc, char *argv[])
 
 	parse_psql_options(argc, argv, options);
 
-	if (!pset.popt.topt.fieldSep.separator)
+	if (!pset.popt.topt.fieldSep.separator 
+		!pset.popt.topt.fieldSep.separator_zero)
 	{
 		pset.popt.topt.fieldSep.separator = pg_strdup(DEFAULT_FIELD_SEP);
 		pset.popt.topt.fieldSep.separator_zero = false;
 	}
-	if (!pset.popt.topt.recordSep.separator)
+	if (!pset.popt.topt.recordSep.separator 
+		!pset.popt.topt.recordSep.separator_zero)
 	{
 		pset.popt.topt.recordSep.separator = pg_strdup(DEFAULT_RECORD_SEP);
 		pset.popt.topt.recordSep.separator_zero = false;
diff --git a/src/bin/psql/print.c b/src/bin/psql/print.c
index 43ddab1..94535eb 100644
--- a/src/bin/psql/print.c
+++ b/src/bin/psql/print.c
@@ -357,7 +357,12 @@ print_unaligned_text(const printTableContent *cont, FILE *fout)
 		}
 		/* the last record needs to be concluded with a newline */
 		if (need_recordsep)
-			fputc('\n', fout);
+		{
+			if (cont-opt-recordSep.separator_zero)
+print_separator(cont-opt-recordSep, fout);
+			else
+fputc('\n', fout);
+		}
 	}
 }
 

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] psql NUL record and field separator

2012-01-14 Thread Peter Eisentraut
Inspired by this question http://stackoverflow.com/questions/6857265 I
have implemented a way to set the psql record and field separators to a
zero byte (ASCII NUL character).  This can be very useful in shell
scripts to have an unambiguous separator.  Other GNU tools such as find,
grep, sort, xargs also support this.  So with this you could for example
do

psql --record-separator-zero -At -c 'select something from somewhere' | xargs 
-0 dosomething

I have thought about two different ways to implement this.  Attempt one
was to make the backslash command option parsing zero-byte proof top to
bottom by using PQExpBuffers, so you could then write \R '\000'.  But
that turned out to be very invasive and complicated.  And worst, you
couldn't use it from the command line, because psql -R '\000' doesn't
work (the octal escape syntax is not used on the command line).

So attempt two, which I present here, is to just have separate syntax to
set the separators to zero bytes.  From the command line it would be
--record-separator-zero and --field-separator-zero, and from within psql
it would be \pset recordsep_zero and \pset fieldsep_zero.  I don't care
much for the verbosity of this, so I'm still thinking about ways to
abbreviate this.  I think the most common use of this would be to set
the record separator from the command line, so we could use a short
option such as -0 or -z for that.

Patch attached.  Comments welcome.
diff --git i/doc/src/sgml/ref/psql-ref.sgml w/doc/src/sgml/ref/psql-ref.sgml
index a9b1ed2..752d6de 100644
--- i/doc/src/sgml/ref/psql-ref.sgml
+++ w/doc/src/sgml/ref/psql-ref.sgml
@@ -193,6 +193,15 @@ PostgreSQL documentation
 /varlistentry
 
 varlistentry
+  termoption--field-separator-zero/option/term
+  listitem
+  para
+  Set the field separator for unaligned output to a zero byte.
+  /para
+  /listitem
+/varlistentry
+
+varlistentry
   termoption-h replaceable class=parameterhostname/replaceable//term
   termoption--host=replaceable class=parameterhostname/replaceable//term
   listitem
@@ -320,6 +329,16 @@ PostgreSQL documentation
 /varlistentry
 
 varlistentry
+  termoption--record-separator-zero/option/term
+  listitem
+  para
+  Set the record separator for unaligned output to a zero byte.  This is
+  useful for interfacing, for example, with literalxargs -0/literal.
+  /para
+  /listitem
+/varlistentry
+
+varlistentry
   termoption-s//term
   termoption--single-step//term
   listitem
@@ -1909,6 +1928,16 @@ lo_import 152801
   /varlistentry
 
   varlistentry
+  termliteralfieldsep_zero/literal/term
+  listitem
+  para
+  Sets the field separator to use in unaligned output format to a zero
+  byte.
+  /para
+  /listitem
+  /varlistentry
+
+  varlistentry
   termliteralfooter/literal/term
   listitem
   para
@@ -2078,6 +2107,16 @@ lo_import 152801
   /varlistentry
 
   varlistentry
+  termliteralrecordsep_zero/literal/term
+  listitem
+  para
+  Sets the record separator to use in unaligned output format to a zero
+  byte.
+  /para
+  /listitem
+  /varlistentry
+
+  varlistentry
   termliteraltableattr/literal (or literalT/literal)/term
   listitem
   para
diff --git i/src/bin/psql/command.c w/src/bin/psql/command.c
index 69fac83..9421a73 100644
--- i/src/bin/psql/command.c
+++ w/src/bin/psql/command.c
@@ -2284,11 +2284,26 @@ do_pset(const char *param, const char *value, printQueryOpt *popt, bool quiet)
 	{
 		if (value)
 		{
-			free(popt-topt.fieldSep);
-			popt-topt.fieldSep = pg_strdup(value);
+			free(popt-topt.fieldSep.separator);
+			popt-topt.fieldSep.separator = pg_strdup(value);
+			popt-topt.fieldSep.separator_zero = false;
 		}
 		if (!quiet)
-			printf(_(Field separator is \%s\.\n), popt-topt.fieldSep);
+		{
+			if (popt-topt.fieldSep.separator_zero)
+printf(_(Field separator is zero byte.\n));
+			else
+printf(_(Field separator is \%s\.\n), popt-topt.fieldSep.separator);
+		}
+	}
+
+	else if (strcmp(param, fieldsep_zero) == 0)
+	{
+		free(popt-topt.fieldSep.separator);
+		popt-topt.fieldSep.separator = NULL;
+		popt-topt.fieldSep.separator_zero = true;
+		if (!quiet)
+			printf(_(Field separator is zero byte.\n));
 	}
 
 	/* record separator for unaligned text */
@@ -2296,18 +2311,30 @@ do_pset(const char *param, const char *value, printQueryOpt *popt, bool quiet)
 	{
 		if (value)
 		{
-			free(popt-topt.recordSep);
-			popt-topt.recordSep = pg_strdup(value);
+			free(popt-topt.recordSep.separator);
+			popt-topt.recordSep.separator = pg_strdup(value);
+			popt-topt.recordSep.separator_zero = false;
 		}
 		if (!quiet)
 		{
-			if (strcmp(popt-topt.recordSep, \n) == 0)
+			if (popt-topt.recordSep.separator_zero)
+printf(_(Record separator