Re: pg_ls_tmpdir to show directories and shared filesets (and pg_ls_*)

2024-01-21 Thread Peter Smith
2024-01 Commitfest.

Hi, this patch was marked in CF as "Needs Review", but there has been
no activity on this thread for 14+ months.

Since there seems not much interest, I have changed the status to
"Returned with Feedback" [1]. Feel free to propose a stronger use case
for the patch and add an entry for the same.

==
[1] https://commitfest.postgresql.org/46/2377/

Kind Regards,
Peter Smith.




Re: pg_ls_tmpdir to show directories and shared filesets (and pg_ls_*)

2022-10-27 Thread Justin Pryzby
On Fri, Dec 13, 2019 at 03:03:47PM +1300, Thomas Munro wrote:
> > Actually, I tried using pg_ls_tmpdir(), but it unconditionally masks
> > non-regular files and thus shared filesets.  Maybe that's worth
> > discussion on a new thread ?
> >
> > src/backend/utils/adt/genfile.c
> > /* Ignore anything but regular files */
> > if (!S_ISREG(attrib.st_mode))
> > continue;
> 
> +1, that's worth fixing.

@cfbot: rebased on eddc128be.
>From f641f63a1ff3efd90a3831927c758636cc82d080 Mon Sep 17 00:00:00 2001
From: Justin Pryzby 
Date: Mon, 16 Mar 2020 14:12:55 -0500
Subject: [PATCH v37 01/11] Document historic behavior of links to
 directories..

Backpatch to 9.5: pg_stat_file
---
 doc/src/sgml/func.sgml | 4 
 1 file changed, 4 insertions(+)

diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 6e0425cb3dc..d958c3e74ac 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -27700,6 +27700,10 @@ SELECT convert_from(pg_read_binary_file('file_in_utf8.txt'), 'UTF8');
 platforms only), file creation time stamp (Windows only), and a flag
 indicating if it is a directory.

+   
+   If filename is a link, this function returns information about the file
+   or directory the link refers to.
+   

 This function is restricted to superusers by default, but other users
 can be granted EXECUTE to run the function.
-- 
2.25.1

>From 3f28fa01fa8e01b633e5c2a24ba4deed4abe6053 Mon Sep 17 00:00:00 2001
From: Justin Pryzby 
Date: Tue, 17 Mar 2020 13:16:24 -0500
Subject: [PATCH v37 02/11] Add tests before changing pg_ls_*

---
 src/test/regress/expected/misc_functions.out | 59 
 src/test/regress/sql/misc_functions.sql  | 15 +
 2 files changed, 74 insertions(+)

diff --git a/src/test/regress/expected/misc_functions.out b/src/test/regress/expected/misc_functions.out
index 88bb696ded8..77d285ecc85 100644
--- a/src/test/regress/expected/misc_functions.out
+++ b/src/test/regress/expected/misc_functions.out
@@ -480,6 +480,65 @@ select count(*) > 0 from
  t
 (1 row)
 
+select * from (select pg_ls_dir('.', false, true) as name) as ls where ls.name='.'; -- include_dot_dirs=true
+ name 
+--
+ .
+(1 row)
+
+select * from (select pg_ls_dir('.', false, false) as name) as ls where ls.name='.'; -- include_dot_dirs=false
+ name 
+--
+(0 rows)
+
+select pg_ls_dir('does not exist', true, false); -- ok with missingok=true
+ pg_ls_dir 
+---
+(0 rows)
+
+select pg_ls_dir('does not exist'); -- fails with missingok=false
+ERROR:  could not open directory "does not exist": No such file or directory
+-- Check that expected columns are present
+select * from pg_ls_archive_statusdir() limit 0;
+ name | size | modification 
+--+--+--
+(0 rows)
+
+select * from pg_ls_logdir() limit 0;
+ name | size | modification 
+--+--+--
+(0 rows)
+
+select * from pg_ls_logicalmapdir() limit 0;
+ name | size | modification 
+--+--+--
+(0 rows)
+
+select * from pg_ls_logicalsnapdir() limit 0;
+ name | size | modification 
+--+--+--
+(0 rows)
+
+select * from pg_ls_replslotdir('') limit 0;
+ name | size | modification 
+--+--+--
+(0 rows)
+
+select * from pg_ls_tmpdir() limit 0;
+ name | size | modification 
+--+--+--
+(0 rows)
+
+select * from pg_ls_waldir() limit 0;
+ name | size | modification 
+--+--+--
+(0 rows)
+
+select * from pg_stat_file('.') limit 0;
+ size | access | modification | change | creation | isdir 
+--++--++--+---
+(0 rows)
+
 --
 -- Test replication slot directory functions
 --
diff --git a/src/test/regress/sql/misc_functions.sql b/src/test/regress/sql/misc_functions.sql
index b07e9e8dbb3..d299f3d8949 100644
--- a/src/test/regress/sql/misc_functions.sql
+++ b/src/test/regress/sql/misc_functions.sql
@@ -166,6 +166,21 @@ select count(*) > 0 from
where spcname = 'pg_default') pts
   join pg_database db on pts.pts = db.oid;
 
+select * from (select pg_ls_dir('.', false, true) as name) as ls where ls.name='.'; -- include_dot_dirs=true
+select * from (select pg_ls_dir('.', false, false) as name) as ls where ls.name='.'; -- include_dot_dirs=false
+select pg_ls_dir('does not exist', true, false); -- ok with missingok=true
+select pg_ls_dir('does not exist'); -- fails with missingok=false
+
+-- Check that expected columns are present
+select * from pg_ls_archive_statusdir() limit 0;
+select * from pg_ls_logdir() limit 0;
+select * from pg_ls_logicalmapdir() limit 0;
+select * from pg_ls_logicalsnapdir() limit 0;
+select * from pg_ls_replslotdir('') limit 0;
+select * from pg_ls_tmpdir() limit 0;
+select * from pg_ls_waldir() limit 0;
+select * from pg_stat_file('.') limit 0;
+
 --
 -- Test replication slot directory functions
 --
-- 
2.25.1

>From 2fcff37d19869c83eab7c8116946854f52a20515 Mon 

Re: pg_ls_tmpdir to show directories and shared filesets (and pg_ls_*)

2022-06-23 Thread Justin Pryzby
This thread has been going for 2.5 years, so here's a(nother) recap.

This omits the patches for recursion, since they're optional and evidently a
distraction from the main patches.

On Fri, Dec 27, 2019 at 11:02:20AM -0600, Justin Pryzby wrote:
> The goal is to somehow show tmpfiles (or at least dirs) used by parallel
> workers.

On Thu, Jan 16, 2020 at 08:38:46AM -0600, Justin Pryzby wrote:
> I think if someone wants the full generality, they can do this:
> 
> postgres=# SELECT name, s.size, s.modification, s.isdir FROM (SELECT 
> 'base/pgsql_tmp'p)p, pg_ls_dir(p)name, pg_stat_file(p||'/'||name)s;
>  name | size |  modification  | isdir 
> --+--++---
>  .foo | 4096 | 2020-01-16 08:57:04-05 | t
> 
> In my mind, pg_ls_tmpdir() is for showing tmpfiles, not just a shortcut to
> SELECT pg_ls_dir((SELECT 'base/pgsql_tmp'p)); -- or, for all tablespaces:
> WITH x AS (SELECT format('/PG_%s_%s', 
> split_part(current_setting('server_version'), '.', 1), catalog_version_no) 
> suffix FROM pg_control_system()), y AS (SELECT a, pg_ls_dir(a) AS d FROM 
> (SELECT DISTINCT COALESCE(NULLIF(pg_tablespace_location(oid),'')||suffix, 
> 'base') a FROM pg_tablespace,x)a) SELECT a, pg_ls_dir(a||'/pgsql_tmp') FROM y 
> WHERE d='pgsql_tmp';

On Tue, Mar 10, 2020 at 01:30:37PM -0500, Justin Pryzby wrote:
> I took a step back, and I wondered whether we should add a generic function 
> for
> listing a dir with metadata, possibly instead of changing the existing
> functions.  Then one could do pg_ls_dir_metadata('pg_wal',false,false);
> 
> Since pg8.1, we have pg_ls_dir() to show a list of files.  Since pg10, we've
> had pg_ls_logdir and pg_ls_waldir, which show not only file names but also
> (some) metadata (size, mtime).  And since pg12, we've had pg_ls_tmpfile and
> pg_ls_archive_statusdir, which also show metadata.
> 
> ...but there's no a function which lists the metadata of an directory other
> than tmp, wal, log.
> 
> One can do this:
> |SELECT b.*, c.* FROM (SELECT 'base' a)a, LATERAL (SELECT 
> a||'/'||pg_ls_dir(a.a)b)b, pg_stat_file(b)c;
> ..but that's not as helpful as allowing:
> |SELECT * FROM pg_ls_dir_metadata('.',true,true);
> 
> There's also no function which recurses into an arbitrary directory, so it
> seems shortsighted to provide a function to recursively list a tmpdir.
> 
> Also, since pg_ls_dir_metadata indicates whether the path is a dir, one can
> write a SQL function to show the dir recursively.  It'd be trivial to plug in
> wal/log/tmp (it seems like tmpdirs of other tablespace's are not entirely
> trivial).
> |SELECT * FROM pg_ls_dir_recurse('base/pgsql_tmp');

> It's pretty unfortunate if a function called
> pg_ls_tmpdir hides shared filesets, so maybe it really is best to change that
> (it's new in v12).

On Fri, Mar 13, 2020 at 08:12:32AM -0500, Justin Pryzby wrote:
> The merge conflict presents another opportunity to solicit comments on the new
> approach.  Rather than making "recurse into tmpdir" the end goal:
> 
>   - add a function to show metadata of an arbitrary dir;
>   - add isdir arguments to pg_ls_* functions (including pg_ls_tmpdir but not
> pg_ls_dir).
>   - maybe add pg_ls_dir_recurse, which satisfies the original need;
>   - retire pg_ls_dir (does this work with tuplestore?)
>   - profit
> 
> The alternative seems to be to go back to Alvaro's earlier proposal:
>  - not only add "isdir", but also recurse;
> 
> I think I would insist on adding a general function to recurse into any dir.
> And *optionally* change ps_ls_* to recurse (either by accepting an argument, 
> or
> by making that a separate patch to debate).

On Tue, Mar 31, 2020 at 03:08:12PM -0500, Justin Pryzby wrote:
> The patch intends to fix the issue of "failing to show failed filesets"
> (because dirs are skipped) while also generalizing existing functions (to show
> directories and "isdir" column) and providing some more flexible ones (to list
> file and metadata of a dir, which is currently possible [only] for "special"
> directories, or by recursively calling pg_stat_file).

On Wed, Dec 23, 2020 at 01:17:10PM -0600, Justin Pryzby wrote:
> However, pg_ls_tmpdir is special since it handles tablespace tmpdirs, which it
> seems is not trivial to get from sql:
> 
> +SELECT * FROM (SELECT DISTINCT 
> COALESCE(NULLIF(pg_tablespace_location(b.oid),'')||suffix, 'base/pgsql_tmp') 
> AS dir
> +FROM pg_tablespace b, pg_control_system() pcs,
> +LATERAL format('/PG_%s_%s', left(current_setting('server_version_num'), 2), 
> pcs.catalog_version_no) AS suffix) AS dir,
> +LATERAL pg_ls_dir_recurse(dir) AS a;
> 
> For context, the line of reasoning that led me to this patch series was
> something like this:
> 
> 0) Why can't I list shared tempfiles (dirs) using pg_ls_tmpdir() ?
> 1) Implement recursion for pg_ls_tmpdir();
> 2) Eventually realize that it's silly to implement a function to recurse into
>one particular directory when no general feature exists;
> 3) Implement generic facility;


Re: pg_ls_tmpdir to show directories and shared filesets (and pg_ls_*)

2022-04-03 Thread Greg Stark
The cfbot is failing testing this patch. It seems... unlikely given
the nature of the patch modifying an admin function that doesn't even
modify the database that it should be breaking a streaming test.
Perhaps the streaming test is using this function in the testing
scaffolding?

[03:19:29.564] # Failed test 'regression tests pass'
[03:19:29.564] # at t/027_stream_regress.pl line 76.
[03:19:29.564] # got: '256'
[03:19:29.564] # expected: '0'
[03:19:29.564] # Looks like you failed 1 test of 5.
[03:19:29.565] [03:19:27] t/027_stream_regress.pl ..
[03:19:29.565] Dubious, test returned 1 (wstat 256, 0x100)
[03:19:29.565] Failed 1/5 subtests




Re: pg_ls_tmpdir to show directories and shared filesets (and pg_ls_*)

2022-03-31 Thread Justin Pryzby
On Mon, Mar 14, 2022 at 09:37:25PM -0500, Justin Pryzby wrote:
> The original, minimal goal of this patch was to show shared tempdirs in
> pg_ls_tmpfile() - rather than hiding them misleadingly as currently happens.
> 20200310183037.ga29...@telsasoft.com
> 20200313131232.go29...@telsasoft.com
> 
> I added the metadata function 2 years ago since it's silly to show metadata 
> for
> tmpdir but not other, arbitrary directories.
> 20200310183037.ga29...@telsasoft.com
> 20200313131232.go29...@telsasoft.com
> 20201223191710.gr30...@telsasoft.com

I renamed the CF entry to make even more clear the original motive for the
patches (I'm not maintaining the patch to add the metadata function just to
avoid writing a lateral join).

> > In the whole set, improving the docs as of 0001 makes sense, but the
> > change is incomplete.  Most of 0002 also makes sense and should be
> > stable enough.  I am less enthusiastic about any of the other changes
> > proposed and what we can gain from these parts.
> 
> It is frustrating to hear this feedback now, after the patch has gone through
> multiple rewrites over 2 years - based on other positive feedback and review.
> I went to the effort to ask, numerous times, whether to write the patch and 
> how
> its interfaces should look.  Now, I'm hearing that not only the implementation
> but its goals are wrong.  What should I have done to avoid that ?
> 
> 20200503024215.gj28...@telsasoft.com
> 20191227195918.gf12...@telsasoft.com
> 20200116003924.gj26...@telsasoft.com
> 20200908195126.gb18...@telsasoft.com

Michael said he's not enthusiastic about the patch.  But I haven't heard a
suggestion about how else to address the issue that pg_ls_tmpdir() hides shared
filesets.

On Mon, Mar 28, 2022 at 09:13:52PM -0500, Justin Pryzby wrote:
> On Sat, Mar 26, 2022 at 08:23:54PM +0900, Michael Paquier wrote:
> > On Wed, Mar 23, 2022 at 03:17:35PM +0900, Michael Paquier wrote:
> > > FWIW, per my review the bit of the patch set that I found the most
> > > relevant is the addition of a note in the docs of pg_stat_file() about
> > > the case where "filename" is a link, because the code internally uses
> > > stat().   The function name makes that obvious, but that's not
> > > commonly known, I guess.  Please see the attached, that I would be
> > > fine to apply.
> > 
> > Hmm.  I am having second thoughts on this one, as on Windows we rely
> > on GetFileInformationByHandle() for the emulation of stat() in
> > win32stat.c, and it looks like this returns some information about the
> > junction point and not the directory or file this is pointing to, it
> > seems.
> 
> Where did you find that ?  What metadata does it return about the junction
> point ?  We only care about a handful of fields.

Pending your feedback, I didn't modify this beyond your original suggestion -
which seemed like a good one.

This also adds some comments you requested and fixes your coding style
complaints, and causes cfbot to test my proposed patch rather than your doc
patch.

-- 
Justin
>From 354e62f07f345de403d1b2894eba98aec44217b5 Mon Sep 17 00:00:00 2001
From: Justin Pryzby 
Date: Mon, 16 Mar 2020 14:12:55 -0500
Subject: [PATCH v35 1/7] Document historic behavior of links to directories..

Backpatch to 9.5: pg_stat_file
---
 doc/src/sgml/func.sgml | 4 
 1 file changed, 4 insertions(+)

diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 4001cb2bda5..d01aeec9f88 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -27631,6 +27631,10 @@ SELECT convert_from(pg_read_binary_file('file_in_utf8.txt'), 'UTF8');
 platforms only), file creation time stamp (Windows only), and a flag
 indicating if it is a directory.

+   
+   If filename is a link, this function returns information about the file
+   or directory the link refers to.
+   

 This function is restricted to superusers by default, but other users
 can be granted EXECUTE to run the function.
-- 
2.17.1

>From ee202767ae0d6651c552056424b004be2b110ec1 Mon Sep 17 00:00:00 2001
From: Justin Pryzby 
Date: Tue, 17 Mar 2020 13:16:24 -0500
Subject: [PATCH v35 2/7] Add tests before changing pg_ls_*

---
 src/test/regress/expected/misc_functions.out | 59 
 src/test/regress/sql/misc_functions.sql  | 15 +
 2 files changed, 74 insertions(+)

diff --git a/src/test/regress/expected/misc_functions.out b/src/test/regress/expected/misc_functions.out
index 01d1ad0b9a4..45544469af5 100644
--- a/src/test/regress/expected/misc_functions.out
+++ b/src/test/regress/expected/misc_functions.out
@@ -416,6 +416,65 @@ select count(*) > 0 from
  t
 (1 row)
 
+select * from (select pg_ls_dir('.', false, true) as name) as ls where ls.name='.'; -- include_dot_dirs=true
+ name 
+--
+ .
+(1 row)
+
+select * from (select pg_ls_dir('.', false, false) as name) as ls where ls.name='.'; -- include_dot_dirs=false
+ name 
+--
+(0 rows)
+
+select pg_ls_dir('does not exist', true, 

Re: pg_ls_tmpdir to show directories and shared filesets (and pg_ls_*)

2022-03-28 Thread Justin Pryzby
On Sat, Mar 26, 2022 at 08:23:54PM +0900, Michael Paquier wrote:
> On Wed, Mar 23, 2022 at 03:17:35PM +0900, Michael Paquier wrote:
> > FWIW, per my review the bit of the patch set that I found the most
> > relevant is the addition of a note in the docs of pg_stat_file() about
> > the case where "filename" is a link, because the code internally uses
> > stat().   The function name makes that obvious, but that's not
> > commonly known, I guess.  Please see the attached, that I would be
> > fine to apply.
> 
> Hmm.  I am having second thoughts on this one, as on Windows we rely
> on GetFileInformationByHandle() for the emulation of stat() in
> win32stat.c, and it looks like this returns some information about the
> junction point and not the directory or file this is pointing to, it
> seems.

Where did you find that ?  What metadata does it return about the junction
point ?  We only care about a handful of fields.




Re: pg_ls_tmpdir to show directories and shared filesets (and pg_ls_*)

2022-03-26 Thread Michael Paquier
On Wed, Mar 23, 2022 at 03:17:35PM +0900, Michael Paquier wrote:
> FWIW, per my review the bit of the patch set that I found the most
> relevant is the addition of a note in the docs of pg_stat_file() about
> the case where "filename" is a link, because the code internally uses
> stat().   The function name makes that obvious, but that's not
> commonly known, I guess.  Please see the attached, that I would be
> fine to apply.

Hmm.  I am having second thoughts on this one, as on Windows we rely
on GetFileInformationByHandle() for the emulation of stat() in
win32stat.c, and it looks like this returns some information about the
junction point and not the directory or file this is pointing to, it
seems.  At the end, it looks better to keep things simple here, so
let's drop it.
--
Michael


signature.asc
Description: PGP signature


Re: pg_ls_tmpdir to show directories and shared filesets (and pg_ls_*)

2022-03-23 Thread Michael Paquier
On Mon, Mar 21, 2022 at 06:28:28PM -0700, Andres Freund wrote:
> Doesn't apply cleanly anymore: http://cfbot.cputube.org/patch_37_2377.log
> 
> Marked as waiting-on-author.

FWIW, per my review the bit of the patch set that I found the most
relevant is the addition of a note in the docs of pg_stat_file() about
the case where "filename" is a link, because the code internally uses
stat().   The function name makes that obvious, but that's not
commonly known, I guess.  Please see the attached, that I would be
fine to apply.
--
Michael
diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 8a802fb225..ca111f0745 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -27620,6 +27620,10 @@ SELECT convert_from(pg_read_binary_file('file_in_utf8.txt'), 'UTF8');
 platforms only), file creation time stamp (Windows only), and a flag
 indicating if it is a directory.

+   
+If filename is a link, this function returns
+information about the file or directory the link refers to.
+   

 This function is restricted to superusers by default, but other users
 can be granted EXECUTE to run the function.


signature.asc
Description: PGP signature


Re: pg_ls_tmpdir to show directories and shared filesets (and pg_ls_*)

2022-03-21 Thread Andres Freund
Hi,

On 2022-03-09 10:50:45 -0600, Justin Pryzby wrote:
> Rebased over 9e9858389 (Michael may want to look at the tuplestore part?).

Doesn't apply cleanly anymore: http://cfbot.cputube.org/patch_37_2377.log

Marked as waiting-on-author.

Greetings,

Andres Freund




Re: pg_ls_tmpdir to show directories and shared filesets (and pg_ls_*)

2022-03-14 Thread Michael Paquier
On Mon, Mar 14, 2022 at 09:37:25PM -0500, Justin Pryzby wrote:
> One could argue that most of the pg_ls_* functions aren't needed (including
> 1922d7c6e), since the same things are possible with pg_ls_dir() and
> pg_stat_file().
> |1922d7c6e Add SQL functions to monitor the directory contents of replication 
> slots

This main argument behind this one is monitoring, as the execution to
those functions can be granted at a granular level depending on the
roles doing the disk space lookups.
--
Michael


signature.asc
Description: PGP signature


Re: pg_ls_tmpdir to show directories and shared filesets (and pg_ls_*)

2022-03-14 Thread Justin Pryzby
On Mon, Mar 14, 2022 at 01:53:54PM +0900, Michael Paquier wrote:
> On Wed, Mar 09, 2022 at 10:50:45AM -0600, Justin Pryzby wrote:
> > I also changed pg_ls_dir_recurse() to handle concurrent removal of a dir, 
> > which
> > I noticed caused an infrequent failure on CI.  However I'm not including 
> > that
> > here, since the 2nd half of the patch set seems isn't ready due to lstat() 
> > on
> > windows.
> 
> lstat() has been a subject of many issues over the years with our
> internal emulation and issues related to its concurrency, but we use
> it in various areas of the in-core code, so that does not sound like
> an issue to me.  It depends on what you want to do with it in
> genfile.c and which data you'd expect, in addition to the detection of
> junction points for WIN32, I guess.

To avoid cycles, a recursion function would need to know whether to recurse
into a directory or to output that something is isdir=false or type=link, and
avoid recursing into it. 

> pg_ls_dir_recurse(), but that's a WITH RECURSIVE, so we would not
> really need it, do we?

Tom disliked it when I had written it as a recursive CTE, so I rewrote it in C.
129225.1606166...@sss.pgh.pa.us

> Hmm.  I am not convinced that we need a new set of SQL functions as
> presented in 0003 (addition of meta-data in pg_ls_dir()), and
> extensions of 0004 (do the same but for pg_ls_tmpdir) and 0005 (same
> for the other pg_ls* functions).  The changes of pg_ls_dir_files()
> make in my opinion the code harder to follow as the resulting tuple 
> size depends on the type of flag used, and you can already retrieve
> the rest of the information with a join, probably LATERAL, on
> pg_stat_file(), no?  The same can be said about 0007, actually.

Yes, one can get the same information with a lateral join (as I said 2 years
ago).  But it's more helpful to provide a function, rather than leave that to
people to figure out - possibly incorrectly, or badly, like by parsing the
output of COPY FROM PROGRAM 'ls -l'.  The query to handle tablespaces is
particularly obscure:
20200310183037.ga29...@telsasoft.com
20201223191710.gr30...@telsasoft.com

One could argue that most of the pg_ls_* functions aren't needed (including
1922d7c6e), since the same things are possible with pg_ls_dir() and
pg_stat_file().
|1922d7c6e Add SQL functions to monitor the directory contents of replication 
slots

The original, minimal goal of this patch was to show shared tempdirs in
pg_ls_tmpfile() - rather than hiding them misleadingly as currently happens.
20200310183037.ga29...@telsasoft.com
20200313131232.go29...@telsasoft.com

I added the metadata function 2 years ago since it's silly to show metadata for
tmpdir but not other, arbitrary directories.
20200310183037.ga29...@telsasoft.com
20200313131232.go29...@telsasoft.com
20201223191710.gr30...@telsasoft.com

> The addition of isdir for any of the paths related to pg_logical/ and
> the replication slots has also a limited interest, because we know
> already those contents, and these are not directories as far as I
> recall.

Except when we don't, since extensions can do things that core doesn't, as
Fabien pointed out.
alpine.DEB.2.21.2001160927390.30419@pseudo

> In the whole set, improving the docs as of 0001 makes sense, but the
> change is incomplete.  Most of 0002 also makes sense and should be
> stable enough.  I am less enthusiastic about any of the other changes
> proposed and what we can gain from these parts.

It is frustrating to hear this feedback now, after the patch has gone through
multiple rewrites over 2 years - based on other positive feedback and review.
I went to the effort to ask, numerous times, whether to write the patch and how
its interfaces should look.  Now, I'm hearing that not only the implementation
but its goals are wrong.  What should I have done to avoid that ?

20200503024215.gj28...@telsasoft.com
20191227195918.gf12...@telsasoft.com
20200116003924.gj26...@telsasoft.com
20200908195126.gb18...@telsasoft.com




Re: pg_ls_tmpdir to show directories and shared filesets (and pg_ls_*)

2022-03-14 Thread Michael Paquier
On Mon, Mar 14, 2022 at 01:53:54PM +0900, Michael Paquier wrote:
> +select * from pg_ls_logicalmapdir() limit 0;
> +select * from pg_ls_logicalsnapdir() limit 0;
> +select * from pg_ls_replslotdir('') limit 0;
> +select * from pg_ls_tmpdir() limit 0;
> +select * from pg_ls_waldir() limit 0;
> +select * from pg_stat_file('.') limit 0;
> 
> The rest of the patch set should be stable AFAIK, there are various
> steps when running a checkpoint that makes sure that any of these
> exist, without caring about the value of wal_level.

I was contemplating at 0002 this morning, so see which parts would be
independently useful, and got reminded that we already check the
execution of all those functions in other regression tests, like
test_decoding for the replication slot ones and misc_functions.sql for
the more critical ones, so those extra queries would be just
interesting to check the shape of their SRFs, which is related to the
other patches of the set and limited based on my arguments from
yesterday.

There was one thing that stood out though: there was nothing for the
two options of pg_ls_dir(), called missing_ok and include_dot_dirs.
missing_ok is embedded in one query of pg_rewind, but this is a
counter-measure against concurrent file removals so we cannot rely on
pg_rewind to check that.  And the second option was not run at all.

I have extracted both test cases after rewriting them a bit, and
applied that separately.
--
Michael


signature.asc
Description: PGP signature


Re: pg_ls_tmpdir to show directories and shared filesets (and pg_ls_*)

2022-03-13 Thread Michael Paquier
On Wed, Mar 09, 2022 at 10:50:45AM -0600, Justin Pryzby wrote:
> I also changed pg_ls_dir_recurse() to handle concurrent removal of a dir, 
> which
> I noticed caused an infrequent failure on CI.  However I'm not including that
> here, since the 2nd half of the patch set seems isn't ready due to lstat() on
> windows.

lstat() has been a subject of many issues over the years with our
internal emulation and issues related to its concurrency, but we use
it in various areas of the in-core code, so that does not sound like
an issue to me.  It depends on what you want to do with it in
genfile.c and which data you'd expect, in addition to the detection of
junction points for WIN32, I guess.  v34 has no references to
pg_ls_dir_recurse(), but that's a WITH RECURSIVE, so we would not
really need it, do we?

@@ -27618,7 +27618,7 @@ SELECT 
convert_from(pg_read_binary_file('file_in_utf8.txt'), 'UTF8');
 Returns a record containing the file's size, last access time stamp,
 last modification time stamp, last file status change time stamp (Unix
 platforms only), file creation time stamp (Windows only), and a flag
-indicating if it is a directory.
+indicating if it is a directory (or a symbolic link to a directory).


 This function is restricted to superusers by default, but other users

This is from 0001, and this addition in the documentation is not
completely right.  As pg_stat_file() uses stat() to get back the
information of a file/directory, we'd just follow the link if
specifying one in the input argument.  We could say instead, if we
were to improve the docs, that "If filename is a link, this function
returns information about the file or directory the link refers to."
I would put that as a different paragraph.

+select * from pg_ls_archive_statusdir() limit 0;
+ name | size | modification 
+--+--+--
+(0 rows)

FWIW, this one is fine as of ValidateXLOGDirectoryStructure() that
would make sure archive_status exists before any connection is
attempted to the cluster.

> +select * from pg_ls_logdir() limit 0;

This test on pg_ls_logdir() would fail if running installcheck on a
cluster that has logging_collector disabled.  So this cannot be
included.

+select * from pg_ls_logicalmapdir() limit 0;
+select * from pg_ls_logicalsnapdir() limit 0;
+select * from pg_ls_replslotdir('') limit 0;
+select * from pg_ls_tmpdir() limit 0;
+select * from pg_ls_waldir() limit 0;
+select * from pg_stat_file('.') limit 0;

The rest of the patch set should be stable AFAIK, there are various
steps when running a checkpoint that makes sure that any of these
exist, without caring about the value of wal_level.

+   
+For each file in the specified directory, list the file and its
+metadata.
+Restricted to superusers by default, but other users can be granted
+EXECUTE to run the function.
+   

What is metadata in this case?  (I have read the code and know what
you mean, but folks only looking at the documentation may be puzzled
by that).  It could be cleaner to use the same tupledesc for any
callers of this function, and return NULL in cases these are not 
adapted.

+   /* check the optional arguments */
+   if (PG_NARGS() == 3)
+   {
+   if (!PG_ARGISNULL(1))
+   {
+   if (PG_GETARG_BOOL(1))
+   flags |= LS_DIR_MISSING_OK;
+   else
+   flags &= ~LS_DIR_MISSING_OK;
+   }
+
+   if (!PG_ARGISNULL(2))
+   {
+   if (PG_GETARG_BOOL(2))
+   flags &= ~LS_DIR_SKIP_DOT_DIRS;
+   else
+   flags |= LS_DIR_SKIP_DOT_DIRS;
+   }
+   }

The subtle different between the false and true code paths of those
arguments 1 and 2 had better be explained?  The bit-wise operations
are slightly different in both cases, so it is not clear which part
does what, and what's the purpose of this logic.

-   SetSingleFuncCall(fcinfo, 0);
+   /* isdir depends on metadata */
+   Assert(!(flags_DIR_ISDIR) || (flags_DIR_METADATA));
+   /* Unreasonable to show isdir and skip dirs */
+   Assert(!(flags_DIR_ISDIR) || !(flags_DIR_SKIP_DIRS));

Incorrect code format.  Spaces required.

+-- This tests the missing_ok parameter, which causes pg_ls_tmpdir to
succeed even if the tmpdir doesn't exist yet
+-- The name='' condition is never true, so the function runs to
completion but returns zero rows.
+-- The query is written to ERROR if the tablespace doesn't exist,
rather than silently failing to call pg_ls_tmpdir()
+SELECT c.* FROM (SELECT oid FROM pg_tablespace b WHERE
b.spcname='regress_tblspace' UNION SELECT 0 ORDER BY 1 DESC LIMIT 1)
AS b , pg_ls_tmpdir(oid) AS c WHERE c.name='Does not exist';

So, here, we have a test that may not actually test what we want to
test.

Hmm.  I am not convinced that we need a new set of SQL functions as
presented in 0003 (addition of meta-data in pg_ls_dir()), and
extensions of 0004 (do the same but for pg_ls_tmpdir) and 

Re: pg_ls_tmpdir to show directories and shared filesets (and pg_ls_*)

2022-03-12 Thread Justin Pryzby
On Sun, Mar 13, 2022 at 09:45:35AM +0900, Michael Paquier wrote:
> On Wed, Mar 09, 2022 at 10:50:45AM -0600, Justin Pryzby wrote:
> > Rebased over 9e9858389 (Michael may want to look at the tuplestore part?).
> 
> Are you referring to the contents of 0003 here that changes the
> semantics of pg_ls_dir_files() regarding its setup call?

Yes, as it has this:

-   SetSingleFuncCall(fcinfo, SRF_SINGLE_USE_EXPECTED); 

  
...
-   SetSingleFuncCall(fcinfo, 0);   

  
...
+   if (flags & LS_DIR_METADATA)

  
+   SetSingleFuncCall(fcinfo, 0);   

  
+   else

  
+   SetSingleFuncCall(fcinfo, SRF_SINGLE_USE_EXPECTED); 

  

-- 
Justin




Re: pg_ls_tmpdir to show directories and shared filesets (and pg_ls_*)

2022-03-12 Thread Michael Paquier
On Wed, Mar 09, 2022 at 10:50:45AM -0600, Justin Pryzby wrote:
> Rebased over 9e9858389 (Michael may want to look at the tuplestore part?).

Are you referring to the contents of 0003 here that changes the
semantics of pg_ls_dir_files() regarding its setup call?
--
Michael


signature.asc
Description: PGP signature


Re: pg_ls_tmpdir to show directories and shared filesets (and pg_ls_*)

2022-03-12 Thread Fabien COELHO



Hello Justin,

Review about v34, up from v32 which I reviewed in January. v34 is an 
updated version of v32, without the part about lstat at the end of the 
series.


All 7 patches apply cleanly.

# part 01

One liner doc improvement about the directory flag.

OK.

# part 02

Add tests for various options on pg_ls_dir, and for pg_stat_file, which were not
exercised before.  "make check" is ok.

OK.

# part 03

This patch adds a new pg_ls_dir_metadata.  Internally, this is an extension of
pg_ls_dir_files function which is used by other pg_ls functions.  Doc ok.

New tests are added which check that the result columns are configured as 
required,
including error cases.

"make check" is ok.

OK.

# part 04

Add a new "isdir" column to "pg_ls_tmpdir" output.  This is a small behavioral
change.

I'm ok with that, however I must say that I'm still unsure why we would 
not jump directly to a "type" char column.  What is wrong with outputing 
'd' or '-' instead of true or false? This way, the interface needs not 
change if "lstat" is used later? ISTM that the interface issue should be 
somehow independent of the implementation issue, and we should choose 
directly the right/best interface?


Independently, the documentation may be clearer about what happens to 
"isdir" when the file is a link? It may say that the behavior may change 
in the future?


About homogeneity, I note that some people may be against adding "isdir" 
to other ls functions. I must say that I cannot see a good argument not to 
do it, and that I hate dealing with systems which are not homogeneous 
because it creates surprises and some loss of time.


"make check" ok.

OK.

# part 05

Add isdir to other ls functions. Doc is updated.

Same as above, I'd prefer a char instead of a bool, as it is more extendable and
future-proof.

OK.

# part 06

This part extends and adds a test for pg_ls_logdir.
"make check" is ok.

OK.

# part 07

This part extends pg_stat_file with more date informations.

"make check" is ok.

OK.

# doc

Overall doc generation is OK.

--
Fabien.




Re: pg_ls_tmpdir to show directories and shared filesets (and pg_ls_*)

2022-03-10 Thread Fabien COELHO



Hello Justin,

I hope to look at it over the week-end.

--
Fabien Coelho - CRI, MINES ParisTech




Re: pg_ls_tmpdir to show directories and shared filesets (and pg_ls_*)

2022-03-09 Thread Justin Pryzby
Rebased over 9e9858389 (Michael may want to look at the tuplestore part?).

Fixing a comment typo.

I also changed pg_ls_dir_recurse() to handle concurrent removal of a dir, which
I noticed caused an infrequent failure on CI.  However I'm not including that
here, since the 2nd half of the patch set seems isn't ready due to lstat() on
windows.
>From 47dde043c27840ef43c037f968b268270dc3206d Mon Sep 17 00:00:00 2001
From: Justin Pryzby 
Date: Mon, 16 Mar 2020 14:12:55 -0500
Subject: [PATCH v34 01/15] Document historic behavior of links to
 directories..

Backpatch to 9.5: pg_stat_file
---
 doc/src/sgml/func.sgml | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 8a802fb2253..0be4743e3ed 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -27618,7 +27618,7 @@ SELECT convert_from(pg_read_binary_file('file_in_utf8.txt'), 'UTF8');
 Returns a record containing the file's size, last access time stamp,
 last modification time stamp, last file status change time stamp (Unix
 platforms only), file creation time stamp (Windows only), and a flag
-indicating if it is a directory.
+indicating if it is a directory (or a symbolic link to a directory).


 This function is restricted to superusers by default, but other users
-- 
2.17.1

>From d263f772faa6d76fdf301664bc0b436ec417b2cb Mon Sep 17 00:00:00 2001
From: Justin Pryzby 
Date: Tue, 17 Mar 2020 13:16:24 -0500
Subject: [PATCH v34 02/15] Add tests before changing pg_ls_*

---
 src/test/regress/expected/misc_functions.out | 59 
 src/test/regress/sql/misc_functions.sql  | 15 +
 2 files changed, 74 insertions(+)

diff --git a/src/test/regress/expected/misc_functions.out b/src/test/regress/expected/misc_functions.out
index 8567fcc2b45..b08e7c4a6d0 100644
--- a/src/test/regress/expected/misc_functions.out
+++ b/src/test/regress/expected/misc_functions.out
@@ -393,6 +393,65 @@ select count(*) > 0 from
  t
 (1 row)
 
+select * from (select pg_ls_dir('.', false, true) as name) as ls where ls.name='.'; -- include_dot_dirs=true
+ name 
+--
+ .
+(1 row)
+
+select * from (select pg_ls_dir('.', false, false) as name) as ls where ls.name='.'; -- include_dot_dirs=false
+ name 
+--
+(0 rows)
+
+select pg_ls_dir('does not exist', true, false); -- ok with missingok=true
+ pg_ls_dir 
+---
+(0 rows)
+
+select pg_ls_dir('does not exist'); -- fails with missingok=false
+ERROR:  could not open directory "does not exist": No such file or directory
+-- Check that expected columns are present
+select * from pg_ls_archive_statusdir() limit 0;
+ name | size | modification 
+--+--+--
+(0 rows)
+
+select * from pg_ls_logdir() limit 0;
+ name | size | modification 
+--+--+--
+(0 rows)
+
+select * from pg_ls_logicalmapdir() limit 0;
+ name | size | modification 
+--+--+--
+(0 rows)
+
+select * from pg_ls_logicalsnapdir() limit 0;
+ name | size | modification 
+--+--+--
+(0 rows)
+
+select * from pg_ls_replslotdir('') limit 0;
+ name | size | modification 
+--+--+--
+(0 rows)
+
+select * from pg_ls_tmpdir() limit 0;
+ name | size | modification 
+--+--+--
+(0 rows)
+
+select * from pg_ls_waldir() limit 0;
+ name | size | modification 
+--+--+--
+(0 rows)
+
+select * from pg_stat_file('.') limit 0;
+ size | access | modification | change | creation | isdir 
+--++--++--+---
+(0 rows)
+
 --
 -- Test replication slot directory functions
 --
diff --git a/src/test/regress/sql/misc_functions.sql b/src/test/regress/sql/misc_functions.sql
index 3db3f8bade2..ebb0352ac68 100644
--- a/src/test/regress/sql/misc_functions.sql
+++ b/src/test/regress/sql/misc_functions.sql
@@ -132,6 +132,21 @@ select count(*) > 0 from
where spcname = 'pg_default') pts
   join pg_database db on pts.pts = db.oid;
 
+select * from (select pg_ls_dir('.', false, true) as name) as ls where ls.name='.'; -- include_dot_dirs=true
+select * from (select pg_ls_dir('.', false, false) as name) as ls where ls.name='.'; -- include_dot_dirs=false
+select pg_ls_dir('does not exist', true, false); -- ok with missingok=true
+select pg_ls_dir('does not exist'); -- fails with missingok=false
+
+-- Check that expected columns are present
+select * from pg_ls_archive_statusdir() limit 0;
+select * from pg_ls_logdir() limit 0;
+select * from pg_ls_logicalmapdir() limit 0;
+select * from pg_ls_logicalsnapdir() limit 0;
+select * from pg_ls_replslotdir('') limit 0;
+select * from pg_ls_tmpdir() limit 0;
+select * from pg_ls_waldir() limit 0;
+select * from pg_stat_file('.') limit 0;
+
 --
 -- Test replication slot directory functions
 --
-- 
2.17.1

>From 1c60ac0976f898eacd24881348b5f04b06dc7e1f Mon Sep 17 00:00:00 2001
From: Justin Pryzby 
Date: Mon, 9 Mar 2020 22:40:24 -0500
Subject: 

Re: pg_ls_tmpdir to show directories and shared filesets (and pg_ls_*)

2022-01-25 Thread Justin Pryzby
On Sun, Jan 02, 2022 at 01:07:29PM +0100, Fabien COELHO wrote:
> One liner doc improvement to tell that creation time is only available on 
> windows.
> It is indeed not available on Linux.

The change is about the "isflag" flag, not creation time.

 Returns a record containing the file's size, last access time stamp,
 last modification time stamp, last file status change time stamp (Unix
 platforms only), file creation time stamp (Windows only), and a flag
-indicating if it is a directory.
+indicating if it is a directory (or a symbolic link to a directory).

> # part 03
> ISTM that the "if (1) { if (2) continue; } else if(3) { if (4) continue; }" 
> structure"
> may be simplified to "if (1 && 2) continue; if (3 && 4) continue;", at least 
> if
> IS_DIR and IS_REG are incompatible?

No, what you suggested is not the same;

We talked about this before:
https://www.postgresql.org/message-id/20200315212729.gc26...@telsasoft.com

> Otherwise, at least "else if (3 & 4) continue"?

I could write the *final* "else if" like that, but then it would be different
from the previous case.  Which would be confusing and prone to mistakes.

If I wrote it like this, I think it'd just provoke suggestions from someone
else to change it differently:

/* Skip dirs or special files? */
if (S_ISDIR(attrib.st_mode) && !(flags & LS_DIR_SKIP_DIRS))
continue;
if (!S_ISDIR(attrib.st_mode) && !S_ISREG(attrib.st_mode) && !(flags & 
LS_DIR_SKIP_SPECIAL)
continue;

...
<< Why don't you use "else if" instead of "if (a){} if (!a && b){}" >>

I'm going to leave it up to a committer.

> The ifdef WIN32 (which probably detects windows 64 bits…) overwrites 
> values[3]. ISTM
> it could be reordered so that there is no overwrite, and simpler single 
> assignements.
> 
>   #ifndef WIN32
> v = ...;
>   #else
> v = ... ? ... : ...;
>   #endif

I changed this but without using nested conditionals.

> Add a new "isdir" column to "pg_ls_tmpdir" output.  This is a small behavioral
> change.  I'm ok with it, however I'm unsure why we would not jump directly to
> the "type" char column done later in the patch series.

Because that depends on lstat().

> ISTM all such functions
> should be extended the same way for better homogeneity? That would also impact
> "waldir", "archive_status", "logical_*", "replslot" variants. "make check" ok.

I agree that makes sense, however others have expressed the opposite opinion.
https://www.postgresql.org/message-id/CALj2ACWtrt5EkHrY4WAZ4Cv42SidXAwpeQJU021bxaKpjmbGfA%40mail.gmail.com

The original motive for the patch was that pg_ls_tmpdir doesn't show shared
filesets.  This fixes that essential problem without immediately dragging
everything else along.  I think it's more likely that a committer would merge
them both.  But I don't know, and it's easy to combine patches if desired.

> This patch applies my previous advice:-) ISTM that parts 4 and 5 should be one
> single patch. The test changes show that only waldir has a test. Would it be
> possible to add minimal tests to other variants as well?  "make check" ok.

I have added tests, although some are duplicative.

> This part extends and adds a test for pg_ls_logdir. ISTM that it should
> be merged with the previous patches.  "make check" is ok.

It's seperate to allow writing a separate commit message since it does
something unrelated to the other patches.  What other patch would it would be
merged with ?
| v32-0006-pg_ls_logdir-to-ignore-error-if-initial-top-dir-.patch 

> ISTM that the documentation should be clear about windows vs unix/cygwin 
> specific
> data provided (change/creation).

I preferred to refer to pg_stat_file rather than repeat it for all 7 functions
currently in v15, (and future functions added for new, toplevel dirs).

> # part 11
> 
> This part adds a recurse option. Why not. However, the true value does not
> seem to be tested? "make check" is ok.

WDYM the true value?  It's tested like:

+-- Exercise recursion
+select path, filename, type from pg_ls_dir_metadata('.', true, false, true) 
where
+path in ('base', 'base/pgsql_tmp', 'global', 'global/pg_control', 
'global/pg_filenode.map', 'PG_VERSION', 'pg_multixact', 'pg_multixact/members', 
'pg_multixact/offsets', 'pg_wal', 'pg_wal/archive_status')
+-- (type='d' or 
path~'^(global/.*|PG_VERSION|postmaster\.opts|postmaster\.pid|pg_logical/replorigin_checkpoint)$')
 and filename!~'[0-9]'
+order by path collate "C", filename collate "C";
+  path  |filename | type 
++-+--
+ PG_VERSION | PG_VERSION  | -
+ base   | base| d
+ base/pgsql_tmp | pgsql_tmp   | d
...

-- 
Justin
>From 366edbf62b85a5471f5c3ebeb28a45f2692f9815 Mon Sep 17 00:00:00 2001
From: Justin Pryzby 
Date: Mon, 16 Mar 2020 14:12:55 -0500
Subject: [PATCH v33 01/11] Document historic behavior of links to
 

Re: pg_ls_tmpdir to show directories and shared filesets (and pg_ls_*)

2022-01-17 Thread Julien Rouhaud
Hi,

On Sun, Jan 02, 2022 at 02:55:04PM +0100, Fabien COELHO wrote:
> 
> > Here is my review about v32:
> 
> I forgot to tell that doc generation for the cumulated changes also works.

Unfortunately the patchset doesn't apply anymore:

http://cfbot.cputube.org/patch_36_2377.log
=== Applying patches on top of PostgreSQL commit ID 
4483b2cf29bfe8091b721756928ccbe31c5c8e14 ===
=== applying patch 
./v32-0003-Add-pg_ls_dir_metadata-to-list-a-dir-with-file-m.patch
[...]
patching file src/test/regress/expected/misc_functions.out
Hunk #1 succeeded at 274 (offset 7 lines).
patching file src/test/regress/expected/tablespace.out
Hunk #1 FAILED at 16.
1 out of 1 hunk FAILED -- saving rejects to file 
src/test/regress/expected/tablespace.out.rej

Justin, could you send a rebased version?




Re: pg_ls_tmpdir to show directories and shared filesets (and pg_ls_*)

2022-01-02 Thread Fabien COELHO




Here is my review about v32:


I forgot to tell that doc generation for the cumulated changes also works.

--
Fabien.




Re: pg_ls_tmpdir to show directories and shared filesets (and pg_ls_*)

2022-01-02 Thread Fabien COELHO


Hello Justin,

Happy new year!


I think the 2nd half of the patches are still waiting for fixes to lstat() on
windows.


Not your problem?

Here is my review about v32:

All patches apply cleanly.

# part 01

One liner doc improvement to tell that creation time is only available on 
windows.
It is indeed not available on Linux.

OK.

# part 02

Add tests for various options on pg_ls_dir, and for pg_stat_file, which were not
exercised before.  "make check" is ok.

OK.

# part 03

This patch adds a new pg_ls_dir_metadata.  Internally, this is an extension of
pg_ls_dir_files function which is used by other pg_ls functions.  Doc ok.

About the code:

ISTM that the "if (1) { if (2) continue; } else if(3) { if (4) continue; }" 
structure"
may be simplified to "if (1 && 2) continue; if (3 && 4) continue;", at least if
IS_DIR and IS_REG are incompatible? Otherwise, at least "else if (3 & 4) 
continue"?

The ifdef WIN32 (which probably detects windows 64 bits…) overwrites values[3]. 
ISTM
it could be reordered so that there is no overwrite, and simpler single 
assignements.

  #ifndef WIN32
v = ...;
  #else
v = ... ? ... : ...;
  #endif

New tests are added which check that the result columns are configured as 
required,
including error cases.

"make check" is ok.

OK.

# part 04

Add a new "isdir" column to "pg_ls_tmpdir" output.  This is a small behavioral
change.  I'm ok with it, however I'm unsure why we would not jump directly to
the "type" char column done later in the patch series.  ISTM all such functions
should be extended the same way for better homogeneity? That would also impact
"waldir", "archive_status", "logical_*", "replslot" variants. "make check" ok.

OK.

# part 05

This patch applies my previous advice:-) ISTM that parts 4 and 5 should be one
single patch. The test changes show that only waldir has a test. Would it be
possible to add minimal tests to other variants as well?  "make check" ok.

I'd consider add such tests with part 02.

OK.

# part 06

This part extends and adds a test for pg_ls_logdir. ISTM that it should
be merged with the previous patches.  "make check" is ok.

OK.

# part 07

This part extends pg_stat_file with more date informations.

ISTM that the documentation should be clear about windows vs unix/cygwin 
specific
data provided (change/creation).

The code adds a new value_from_stat function to avoid code duplication.
Fine with me.

All pg_ls_*dir functions are impacted. Fine with me.

"make check" is ok.

OK.

# part 08

This part substitutes lstat to stat. Fine with me.  "make check" is ok.
I guess that lstat does something under windows even if the concept of
link is somehow different there. Maybe the doc should say so somewhere?

OK.

# part 09

This part switches the added "isdir" to a "type" column.  "make check" is ok.
This is a definite improvement.

OK.

# part 10

This part adds a redundant "isdir" column. I do not see the point.
"make check" is ok.

NOT OK.

# part 11

This part adds a recurse option. Why not. However, the true value does not
seem to be tested? "make check" is ok.

My opinion is unclear.

Overall, ignoring part 10, this makes a definite improvement to postgres ls
capabilities. I do not seen any reason why not to add those.

--
Fabien.

Re: pg_ls_tmpdir to show directories and shared filesets (and pg_ls_*)

2021-12-23 Thread Justin Pryzby
On Thu, Dec 23, 2021 at 09:14:18AM -0400, Fabien COELHO wrote:
> It seems that the v31 patch does not apply anymore:
> 
>   postgresql> git apply 
> ~/v31-0001-Document-historic-behavior-of-links-to-directori.patch
>   error: patch failed: doc/src/sgml/func.sgml:27410
>   error: doc/src/sgml/func.sgml: patch does not apply

Thanks for continuing to follow this patch ;)

I fixed a conflict with output/tablespace from d1029bb5a et seq.
I'm not sure why you got a conflict with 0001, though.

I think the 2nd half of the patches are still waiting for fixes to lstat() on
windows.

You complained before that there were too many patches, and I can see how it
might be a pain depending on your workflow.  But the division is deliberate.
Dealing with patchsets is easy for me: I use "mutt" and for each patch
attachment, I type "|git am" (or |git am -3).
>From b8ed86daad1bf9fbb647f7c007130bb0878a2a94 Mon Sep 17 00:00:00 2001
From: Justin Pryzby 
Date: Mon, 16 Mar 2020 14:12:55 -0500
Subject: [PATCH v32 01/11] Document historic behavior of links to
 directories..

Backpatch to 9.5: pg_stat_file
---
 doc/src/sgml/func.sgml | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index e58efce5865..b5c1befe627 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -27495,7 +27495,7 @@ SELECT convert_from(pg_read_binary_file('file_in_utf8.txt'), 'UTF8');
 Returns a record containing the file's size, last access time stamp,
 last modification time stamp, last file status change time stamp (Unix
 platforms only), file creation time stamp (Windows only), and a flag
-indicating if it is a directory.
+indicating if it is a directory (or a symbolic link to a directory).


 This function is restricted to superusers by default, but other users
-- 
2.17.0

>From 88b324a6877ce916589872f5067bd5810e60608e Mon Sep 17 00:00:00 2001
From: Justin Pryzby 
Date: Tue, 17 Mar 2020 13:16:24 -0500
Subject: [PATCH v32 02/11] Add tests on pg_ls_dir before changing it

---
 src/test/regress/expected/misc_functions.out | 24 
 src/test/regress/sql/misc_functions.sql  |  8 +++
 2 files changed, 32 insertions(+)

diff --git a/src/test/regress/expected/misc_functions.out b/src/test/regress/expected/misc_functions.out
index 1013d17f87d..830de507e7c 100644
--- a/src/test/regress/expected/misc_functions.out
+++ b/src/test/regress/expected/misc_functions.out
@@ -243,6 +243,30 @@ select count(*) > 0 from
  t
 (1 row)
 
+select * from (select pg_ls_dir('.', false, true) as name) as ls where ls.name='.'; -- include_dot_dirs=true
+ name 
+--
+ .
+(1 row)
+
+select * from (select pg_ls_dir('.', false, false) as name) as ls where ls.name='.'; -- include_dot_dirs=false
+ name 
+--
+(0 rows)
+
+select pg_ls_dir('does not exist', true, false); -- ok with missingok=true
+ pg_ls_dir 
+---
+(0 rows)
+
+select pg_ls_dir('does not exist'); -- fails with missingok=false
+ERROR:  could not open directory "does not exist": No such file or directory
+-- Check that expected columns are present
+select * from pg_stat_file('.') limit 0;
+ size | access | modification | change | creation | isdir 
+--++--++--+---
+(0 rows)
+
 --
 -- Test replication slot directory functions
 --
diff --git a/src/test/regress/sql/misc_functions.sql b/src/test/regress/sql/misc_functions.sql
index 7ab9b2a1509..422a1369aeb 100644
--- a/src/test/regress/sql/misc_functions.sql
+++ b/src/test/regress/sql/misc_functions.sql
@@ -91,6 +91,14 @@ select count(*) > 0 from
where spcname = 'pg_default') pts
   join pg_database db on pts.pts = db.oid;
 
+select * from (select pg_ls_dir('.', false, true) as name) as ls where ls.name='.'; -- include_dot_dirs=true
+select * from (select pg_ls_dir('.', false, false) as name) as ls where ls.name='.'; -- include_dot_dirs=false
+select pg_ls_dir('does not exist', true, false); -- ok with missingok=true
+select pg_ls_dir('does not exist'); -- fails with missingok=false
+
+-- Check that expected columns are present
+select * from pg_stat_file('.') limit 0;
+
 --
 -- Test replication slot directory functions
 --
-- 
2.17.0

>From 6b7b4a3a7f4fe3edcb5a67f467b10543fa47c343 Mon Sep 17 00:00:00 2001
From: Justin Pryzby 
Date: Mon, 9 Mar 2020 22:40:24 -0500
Subject: [PATCH v32 03/11] Add pg_ls_dir_metadata to list a dir with file
 metadata..

Generalize pg_ls_dir_files and retire pg_ls_dir

Need catversion bumped?
---
 doc/src/sgml/func.sgml   |  21 ++
 src/backend/catalog/system_functions.sql |   1 +
 src/backend/utils/adt/genfile.c  | 239 +++
 src/include/catalog/pg_proc.dat  |  12 +
 src/test/regress/expected/misc_functions.out |  24 ++
 src/test/regress/expected/tablespace.out |   8 +
 src/test/regress/sql/misc_functions.sql  |  11 +
 src/test/regress/sql/tablespace.sql  

Re: pg_ls_tmpdir to show directories and shared filesets (and pg_ls_*)

2021-12-23 Thread Fabien COELHO



Hello Justin,

It seems that the v31 patch does not apply anymore:

  postgresql> git apply 
~/v31-0001-Document-historic-behavior-of-links-to-directori.patch
  error: patch failed: doc/src/sgml/func.sgml:27410
  error: doc/src/sgml/func.sgml: patch does not apply

--
Fabien.




Re: pg_ls_tmpdir to show directories and shared filesets (and pg_ls_*)

2021-11-23 Thread Justin Pryzby
On Mon, Nov 22, 2021 at 07:17:01PM +, Bossart, Nathan wrote:
> In an attempt to get this patch set off the ground again, I took a
> look at the first 5 patches.

> I haven't looked at the following patches too much, but I'm getting
> the idea that they might address a lot of the feedback above and that
> the first bunch of patches are more like staging patches that add the
> abilities without changing the behavior.  I wonder if just going
> straight to the end goal behavior might simplify the patch set a bit.
> I can't say I feel too strongly about this, but I figure I'd at least
> share my thoughts.

Thanks for looking.

The patches are separate since the early patches are the most necessary, least
disputable parts, to allow the possibility of (say) chaging pg_ls_tmpdir() 
without
changing other functions, since pg_ls_tmpdir was was original motivation behind
this whole thread.

In a recent thread, Bharath Rupireddy added pg_ls functions for the logical
dirs, but expressed a preference not to add the metadata columns.  I still
think that at least "isdir" should be added to all the "ls" functions, since
it's easy to SELECT the columns you want, and a bit of a pain to write the
corresponding LATERAL query: 'dir' AS dir, pg_ls_dir(dir) AS ls,
pg_stat_file(ls) AS st.  I think it would be strange if pg_ls_tmpdir() were to
return a different set of columns than the other functions, even though admins
or extensions might have created dirs or other files in those directories.

Tom pointed out that we don't have a working lstat() for windows, so then it
seems like we're not yet ready to show file "types" (we'd show the type of the
link target, which is sometimes what's wanted, but not usually what "ls" would
show), nor ready to implement recurse.

As before:

On Tue, Apr 06, 2021 at 11:01:31AM -0500, Justin Pryzby wrote:
> The first handful of patches address the original issue, and I think could be
> "ready":
> 
> $ git log --oneline origin..pg-ls-dir-new |tac
> ... Document historic behavior of links to directories..
> ... Add tests on pg_ls_dir before changing it
> ... Add pg_ls_dir_metadata to list a dir with file metadata..
> ... pg_ls_tmpdir to show directories and "isdir" argument..
> ... pg_ls_*dir to show directories and "isdir" column..
> 
> These others are optional:
> ... pg_ls_logdir to ignore error if initial/top dir is missing..
> ... pg_ls_*dir to return all the metadata from pg_stat_file..
> 
> ..and these maybe requires more work for lstat on windows:
> ... pg_stat_file and pg_ls_dir_* to use lstat()..
> ... pg_ls_*/pg_stat_file to show file *type*..
> ... Preserve pg_stat_file() isdir..
> ... Add recursion option in pg_ls_dir_files..

rebased on 1922d7c6e1a74178bd2f1d5aa5a6ab921b3fcd34

-- 
Justin
>From 6bdb81194999cfe6b0ba4d850e4f31d022655a5b Mon Sep 17 00:00:00 2001
From: Justin Pryzby 
Date: Mon, 16 Mar 2020 14:12:55 -0500
Subject: [PATCH v31 01/11] Document historic behavior of links to
 directories..

Backpatch to 9.5: pg_stat_file
---
 doc/src/sgml/func.sgml | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 74d3087a72..d36479d86d 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -27410,7 +27410,7 @@ SELECT convert_from(pg_read_binary_file('file_in_utf8.txt'), 'UTF8');
 Returns a record containing the file's size, last access time stamp,
 last modification time stamp, last file status change time stamp (Unix
 platforms only), file creation time stamp (Windows only), and a flag
-indicating if it is a directory.
+indicating if it is a directory (or a symbolic link to a directory).


 This function is restricted to superusers by default, but other users
-- 
2.17.0

>From d6068e34df8dd9ca8db1959010f1b41933732775 Mon Sep 17 00:00:00 2001
From: Justin Pryzby 
Date: Tue, 17 Mar 2020 13:16:24 -0500
Subject: [PATCH v31 02/11] Add tests on pg_ls_dir before changing it

---
 src/test/regress/expected/misc_functions.out | 24 
 src/test/regress/sql/misc_functions.sql  |  8 +++
 2 files changed, 32 insertions(+)

diff --git a/src/test/regress/expected/misc_functions.out b/src/test/regress/expected/misc_functions.out
index 1013d17f87..830de507e7 100644
--- a/src/test/regress/expected/misc_functions.out
+++ b/src/test/regress/expected/misc_functions.out
@@ -243,6 +243,30 @@ select count(*) > 0 from
  t
 (1 row)
 
+select * from (select pg_ls_dir('.', false, true) as name) as ls where ls.name='.'; -- include_dot_dirs=true
+ name 
+--
+ .
+(1 row)
+
+select * from (select pg_ls_dir('.', false, false) as name) as ls where ls.name='.'; -- include_dot_dirs=false
+ name 
+--
+(0 rows)
+
+select pg_ls_dir('does not exist', true, false); -- ok with missingok=true
+ pg_ls_dir 
+---
+(0 rows)
+
+select pg_ls_dir('does not exist'); -- fails with missingok=false
+ERROR:  could not open directory "does not exist": No 

Re: pg_ls_tmpdir to show directories and shared filesets (and pg_ls_*)

2021-11-22 Thread Bossart, Nathan
In an attempt to get this patch set off the ground again, I took a
look at the first 5 patches.

0001: This one is a very small documentation update for pg_stat_file
to point out that isdir will be true for symbolic links to
directories.  Given this is true, I think the patch looks good.

0002: This patch adds some very basic testing for pg_ls_dir().  The
only comment that I have for this one is that I might also check
whether '..' is included in the results of the include_dot_dirs tests.
The docs specifically note that include_dot_dirs indicates whether
both '.' and '..' are included, so IMO we might as well verify that.

0003: This one didn't apply cleanly until I used 'git apply -3', so it
likely needs a rebase.  This patch introduces the pg_ls_dir_metadata()
function, which appears to just be pg_ls_dir() with some additional
columns for the size and modification time.  My initial reaction to
this one is that we should just add those columns to pg_ls_dir() to
match all the other pg_ls_* functions (and not bother attempting to
maintain historic behavior for things like hidden and special files).
I believe there is some existing discussion on this point upthread, so
perhaps there is a good reason to make a new function.  In any case, I
like the idea of having pg_ls_dir() use pg_ls_dir_files() internally
like the rest of the pg_ls_* functions.

0004: This one changes pg_ls_tmpdir to show directories as well.  I
think this is a reasonable change.  On it's own, the patch looks
alright, although it might look different if my suggestions for 0003
were followed.

0005: This one adjusts the rest of the pg_ls_* functions to show
directories.  Again, I think this is a reasonable change.  As noted in
0003, I think it'd be alright just to have all the pg_ls_* functions
show special and hidden files as well.  It's simple enough already to
filter our files that start with '.' if necessary, and I'm not sure
there's any strong reason for leaving out special files.  If special
files are included, perhaps isdir should be changed to indicate the
file type instead of just whether it is a directory.  (Reading ahead,
it looks like this is what 0009 might do.)

I haven't looked at the following patches too much, but I'm getting
the idea that they might address a lot of the feedback above and that
the first bunch of patches are more like staging patches that add the
abilities without changing the behavior.  I wonder if just going
straight to the end goal behavior might simplify the patch set a bit.
I can't say I feel too strongly about this, but I figure I'd at least
share my thoughts.

Nathan



Re: pg_ls_tmpdir to show directories and shared filesets (and pg_ls_*)

2021-07-02 Thread Justin Pryzby
On Tue, Apr 06, 2021 at 11:01:31AM -0500, Justin Pryzby wrote:
> On Wed, Dec 23, 2020 at 01:17:10PM -0600, Justin Pryzby wrote:
> > On Mon, Nov 23, 2020 at 04:14:18PM -0500, Tom Lane wrote:
> > > * I noticed that you did s/stat/lstat/.  That's fine on Unix systems,
> > > but it won't have any effect on Windows systems (cf bed90759f),
> > > which means that we'll have to document a platform-specific behavioral
> > > difference.  Do we want to go there?
> > >
> > > Maybe this patch needs to wait on somebody fixing our lack of real 
> > > lstat() on Windows.
> > 
> > I think only the "top" patches depend on lstat (for the "type" column and
> > recursion, to avoid loops).  The initial patches are independently useful, 
> > and
> > resolve the original issue of hiding tmpdirs.  I've rebased and re-arranged 
> > the
> > patches to reflect this.
> 
> I said that, but then failed to attach the re-arranged patches.
> Now I also renumbered OIDs following best practice.
> 
> The first handful of patches address the original issue, and I think could be
> "ready":
> 
> $ git log --oneline origin..pg-ls-dir-new |tac
> ... Document historic behavior of links to directories..
> ... Add tests on pg_ls_dir before changing it
> ... Add pg_ls_dir_metadata to list a dir with file metadata..
> ... pg_ls_tmpdir to show directories and "isdir" argument..
> ... pg_ls_*dir to show directories and "isdir" column..
> 
> These others are optional:
> ... pg_ls_logdir to ignore error if initial/top dir is missing..
> ... pg_ls_*dir to return all the metadata from pg_stat_file..
> 
> ..and these maybe requires more work for lstat on windows:
> ... pg_stat_file and pg_ls_dir_* to use lstat()..
> ... pg_ls_*/pg_stat_file to show file *type*..
> ... Preserve pg_stat_file() isdir..
> ... Add recursion option in pg_ls_dir_files..

@cfbot: rebased
>From 4b34d394af0253dc3be2e47cb2e8ccd286eea0a3 Mon Sep 17 00:00:00 2001
From: Justin Pryzby 
Date: Mon, 16 Mar 2020 14:12:55 -0500
Subject: [PATCH v30 01/11] Document historic behavior of links to
 directories..

Backpatch to 9.5: pg_stat_file
---
 doc/src/sgml/func.sgml | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 6388385edc..7a830f0684 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -27004,7 +27004,7 @@ SELECT convert_from(pg_read_binary_file('file_in_utf8.txt'), 'UTF8');
 Returns a record containing the file's size, last access time stamp,
 last modification time stamp, last file status change time stamp (Unix
 platforms only), file creation time stamp (Windows only), and a flag
-indicating if it is a directory.
+indicating if it is a directory (or a symbolic link to a directory).


 This function is restricted to superusers by default, but other users
-- 
2.17.0

>From effa738721a008a5d0c65defecfb7b36ab8f33b3 Mon Sep 17 00:00:00 2001
From: Justin Pryzby 
Date: Tue, 17 Mar 2020 13:16:24 -0500
Subject: [PATCH v30 02/11] Add tests on pg_ls_dir before changing it

---
 src/test/regress/expected/misc_functions.out | 24 
 src/test/regress/sql/misc_functions.sql  |  8 +++
 2 files changed, 32 insertions(+)

diff --git a/src/test/regress/expected/misc_functions.out b/src/test/regress/expected/misc_functions.out
index e845042d38..ea0fc48dbd 100644
--- a/src/test/regress/expected/misc_functions.out
+++ b/src/test/regress/expected/misc_functions.out
@@ -214,6 +214,30 @@ select count(*) > 0 from
  t
 (1 row)
 
+select * from (select pg_ls_dir('.', false, true) as name) as ls where ls.name='.'; -- include_dot_dirs=true
+ name 
+--
+ .
+(1 row)
+
+select * from (select pg_ls_dir('.', false, false) as name) as ls where ls.name='.'; -- include_dot_dirs=false
+ name 
+--
+(0 rows)
+
+select pg_ls_dir('does not exist', true, false); -- ok with missingok=true
+ pg_ls_dir 
+---
+(0 rows)
+
+select pg_ls_dir('does not exist'); -- fails with missingok=false
+ERROR:  could not open directory "does not exist": No such file or directory
+-- Check that expected columns are present
+select * from pg_stat_file('.') limit 0;
+ size | access | modification | change | creation | isdir 
+--++--++--+---
+(0 rows)
+
 --
 -- Test adding a support function to a subject function
 --
diff --git a/src/test/regress/sql/misc_functions.sql b/src/test/regress/sql/misc_functions.sql
index a398349afc..eb6ac12ab4 100644
--- a/src/test/regress/sql/misc_functions.sql
+++ b/src/test/regress/sql/misc_functions.sql
@@ -69,6 +69,14 @@ select count(*) > 0 from
where spcname = 'pg_default') pts
   join pg_database db on pts.pts = db.oid;
 
+select * from (select pg_ls_dir('.', false, true) as name) as ls where ls.name='.'; -- include_dot_dirs=true
+select * from (select pg_ls_dir('.', false, false) as name) as ls where ls.name='.'; -- include_dot_dirs=false
+select pg_ls_dir('does not exist', true, 

Re: pg_ls_tmpdir to show directories and shared filesets (and pg_ls_*)

2021-04-08 Thread Justin Pryzby
Breaking with tradition, the previous patch included one too *few* changes, and
failed to resolve the OID collisions.
>From d3d33b25e8571f5a2a3124e85164321f19ca Mon Sep 17 00:00:00 2001
From: Justin Pryzby 
Date: Mon, 16 Mar 2020 14:12:55 -0500
Subject: [PATCH v28 01/11] Document historic behavior of links to
 directories..

Backpatch to 9.5: pg_stat_file
---
 doc/src/sgml/func.sgml | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 0606b6a9aa..aa0dcde886 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -27018,7 +27018,7 @@ SELECT convert_from(pg_read_binary_file('file_in_utf8.txt'), 'UTF8');
 Returns a record containing the file's size, last access time stamp,
 last modification time stamp, last file status change time stamp (Unix
 platforms only), file creation time stamp (Windows only), and a flag
-indicating if it is a directory.
+indicating if it is a directory (or a symbolic link to a directory).


 This function is restricted to superusers by default, but other users
-- 
2.17.0

>From c6300e4cc9fd6826530ac5a5c49eaac7f02c49c0 Mon Sep 17 00:00:00 2001
From: Justin Pryzby 
Date: Tue, 17 Mar 2020 13:16:24 -0500
Subject: [PATCH v28 02/11] Add tests on pg_ls_dir before changing it

---
 src/test/regress/expected/misc_functions.out | 24 
 src/test/regress/sql/misc_functions.sql  |  8 +++
 2 files changed, 32 insertions(+)

diff --git a/src/test/regress/expected/misc_functions.out b/src/test/regress/expected/misc_functions.out
index e845042d38..ea0fc48dbd 100644
--- a/src/test/regress/expected/misc_functions.out
+++ b/src/test/regress/expected/misc_functions.out
@@ -214,6 +214,30 @@ select count(*) > 0 from
  t
 (1 row)
 
+select * from (select pg_ls_dir('.', false, true) as name) as ls where ls.name='.'; -- include_dot_dirs=true
+ name 
+--
+ .
+(1 row)
+
+select * from (select pg_ls_dir('.', false, false) as name) as ls where ls.name='.'; -- include_dot_dirs=false
+ name 
+--
+(0 rows)
+
+select pg_ls_dir('does not exist', true, false); -- ok with missingok=true
+ pg_ls_dir 
+---
+(0 rows)
+
+select pg_ls_dir('does not exist'); -- fails with missingok=false
+ERROR:  could not open directory "does not exist": No such file or directory
+-- Check that expected columns are present
+select * from pg_stat_file('.') limit 0;
+ size | access | modification | change | creation | isdir 
+--++--++--+---
+(0 rows)
+
 --
 -- Test adding a support function to a subject function
 --
diff --git a/src/test/regress/sql/misc_functions.sql b/src/test/regress/sql/misc_functions.sql
index a398349afc..eb6ac12ab4 100644
--- a/src/test/regress/sql/misc_functions.sql
+++ b/src/test/regress/sql/misc_functions.sql
@@ -69,6 +69,14 @@ select count(*) > 0 from
where spcname = 'pg_default') pts
   join pg_database db on pts.pts = db.oid;
 
+select * from (select pg_ls_dir('.', false, true) as name) as ls where ls.name='.'; -- include_dot_dirs=true
+select * from (select pg_ls_dir('.', false, false) as name) as ls where ls.name='.'; -- include_dot_dirs=false
+select pg_ls_dir('does not exist', true, false); -- ok with missingok=true
+select pg_ls_dir('does not exist'); -- fails with missingok=false
+
+-- Check that expected columns are present
+select * from pg_stat_file('.') limit 0;
+
 --
 -- Test adding a support function to a subject function
 --
-- 
2.17.0

>From 5d352f9fee18f44a03f5c373b4aec71f88933402 Mon Sep 17 00:00:00 2001
From: Justin Pryzby 
Date: Mon, 9 Mar 2020 22:40:24 -0500
Subject: [PATCH v28 03/11] Add pg_ls_dir_metadata to list a dir with file
 metadata..

Generalize pg_ls_dir_files and retire pg_ls_dir

Need catversion bumped?
---
 doc/src/sgml/func.sgml   |  21 ++
 src/backend/catalog/system_views.sql |   1 +
 src/backend/utils/adt/genfile.c  | 233 +++
 src/include/catalog/pg_proc.dat  |  12 +
 src/test/regress/expected/misc_functions.out |  24 ++
 src/test/regress/input/tablespace.source |   5 +
 src/test/regress/output/tablespace.source|   8 +
 src/test/regress/sql/misc_functions.sql  |  11 +
 8 files changed, 222 insertions(+), 93 deletions(-)

diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index aa0dcde886..507a6d73f8 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -25821,6 +25821,27 @@ postgres=# SELECT * FROM pg_walfile_name_offset(pg_stop_backup());

   
 
+  
+   
+
+ pg_ls_dir_metadata
+
+pg_ls_dir_metadata ( dirname text
+, missing_ok boolean,
+include_dot_dirs boolean  )
+setof record
+( filename text,
+size bigint,
+modification timestamp with time zone )
+   
+   
+For each file in the specified directory, list the file and its
+

Re: pg_ls_tmpdir to show directories and shared filesets (and pg_ls_*)

2021-04-06 Thread Justin Pryzby
On Wed, Dec 23, 2020 at 01:17:10PM -0600, Justin Pryzby wrote:
> On Mon, Nov 23, 2020 at 04:14:18PM -0500, Tom Lane wrote:
> > * I noticed that you did s/stat/lstat/.  That's fine on Unix systems,
> > but it won't have any effect on Windows systems (cf bed90759f),
> > which means that we'll have to document a platform-specific behavioral
> > difference.  Do we want to go there?
> >
> > Maybe this patch needs to wait on somebody fixing our lack of real lstat() 
> > on Windows.
> 
> I think only the "top" patches depend on lstat (for the "type" column and
> recursion, to avoid loops).  The initial patches are independently useful, and
> resolve the original issue of hiding tmpdirs.  I've rebased and re-arranged 
> the
> patches to reflect this.

I said that, but then failed to attach the re-arranged patches.
Now I also renumbered OIDs following best practice.

The first handful of patches address the original issue, and I think could be
"ready":

$ git log --oneline origin..pg-ls-dir-new |tac
... Document historic behavior of links to directories..
... Add tests on pg_ls_dir before changing it
... Add pg_ls_dir_metadata to list a dir with file metadata..
... pg_ls_tmpdir to show directories and "isdir" argument..
... pg_ls_*dir to show directories and "isdir" column..

These others are optional:
... pg_ls_logdir to ignore error if initial/top dir is missing..
... pg_ls_*dir to return all the metadata from pg_stat_file..

..and these maybe requires more work for lstat on windows:
... pg_stat_file and pg_ls_dir_* to use lstat()..
... pg_ls_*/pg_stat_file to show file *type*..
... Preserve pg_stat_file() isdir..
... Add recursion option in pg_ls_dir_files..

-- 
Justin
>From c8a7cbd4ebbe903a8f373d637543b86b0ffa6dfd Mon Sep 17 00:00:00 2001
From: Justin Pryzby 
Date: Mon, 16 Mar 2020 14:12:55 -0500
Subject: [PATCH v27 01/11] Document historic behavior of links to
 directories..

Backpatch to 9.5: pg_stat_file
---
 doc/src/sgml/func.sgml | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index c6a45d9e55..977b75a531 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -26990,7 +26990,7 @@ SELECT convert_from(pg_read_binary_file('file_in_utf8.txt'), 'UTF8');
 Returns a record containing the file's size, last access time stamp,
 last modification time stamp, last file status change time stamp (Unix
 platforms only), file creation time stamp (Windows only), and a flag
-indicating if it is a directory.
+indicating if it is a directory (or a symbolic link to a directory).


 This function is restricted to superusers by default, but other users
-- 
2.17.0

>From c571e40d0938d897500fbd6fb7a88cb52139da92 Mon Sep 17 00:00:00 2001
From: Justin Pryzby 
Date: Tue, 17 Mar 2020 13:16:24 -0500
Subject: [PATCH v27 02/11] Add tests on pg_ls_dir before changing it

---
 src/test/regress/expected/misc_functions.out | 24 
 src/test/regress/sql/misc_functions.sql  |  8 +++
 2 files changed, 32 insertions(+)

diff --git a/src/test/regress/expected/misc_functions.out b/src/test/regress/expected/misc_functions.out
index e845042d38..ea0fc48dbd 100644
--- a/src/test/regress/expected/misc_functions.out
+++ b/src/test/regress/expected/misc_functions.out
@@ -214,6 +214,30 @@ select count(*) > 0 from
  t
 (1 row)
 
+select * from (select pg_ls_dir('.', false, true) as name) as ls where ls.name='.'; -- include_dot_dirs=true
+ name 
+--
+ .
+(1 row)
+
+select * from (select pg_ls_dir('.', false, false) as name) as ls where ls.name='.'; -- include_dot_dirs=false
+ name 
+--
+(0 rows)
+
+select pg_ls_dir('does not exist', true, false); -- ok with missingok=true
+ pg_ls_dir 
+---
+(0 rows)
+
+select pg_ls_dir('does not exist'); -- fails with missingok=false
+ERROR:  could not open directory "does not exist": No such file or directory
+-- Check that expected columns are present
+select * from pg_stat_file('.') limit 0;
+ size | access | modification | change | creation | isdir 
+--++--++--+---
+(0 rows)
+
 --
 -- Test adding a support function to a subject function
 --
diff --git a/src/test/regress/sql/misc_functions.sql b/src/test/regress/sql/misc_functions.sql
index a398349afc..eb6ac12ab4 100644
--- a/src/test/regress/sql/misc_functions.sql
+++ b/src/test/regress/sql/misc_functions.sql
@@ -69,6 +69,14 @@ select count(*) > 0 from
where spcname = 'pg_default') pts
   join pg_database db on pts.pts = db.oid;
 
+select * from (select pg_ls_dir('.', false, true) as name) as ls where ls.name='.'; -- include_dot_dirs=true
+select * from (select pg_ls_dir('.', false, false) as name) as ls where ls.name='.'; -- include_dot_dirs=false
+select pg_ls_dir('does not exist', true, false); -- ok with missingok=true
+select pg_ls_dir('does not exist'); -- fails with missingok=false
+
+-- Check that expected columns are present

Re: pg_ls_tmpdir to show directories and shared filesets (and pg_ls_*)

2021-03-15 Thread David Steele

On 12/23/20 2:27 PM, Stephen Frost wrote:

* Justin Pryzby (pry...@telsasoft.com) wrote:

On Mon, Nov 23, 2020 at 04:14:18PM -0500, Tom Lane wrote:

* I don't think it's okay to change the existing signatures of
pg_ls_logdir() et al.  Even if you can make an argument that it's
not too harmful to add more output columns, replacing pg_stat_file's
isdir output with something of a different name and datatype is most
surely not OK --- there is no possible way that doesn't break existing
user queries.

I think possibly a more acceptable approach is to leave these functions
alone but add documentation explaining how to get the additional info.
You could say things along the lines of "pg_ls_waldir() is the same as
pg_ls_dir_metadata('pg_wal') except for showing fewer columns."


On Mon, Nov 23, 2020 at 06:06:19PM -0500, Tom Lane wrote:

I'm mostly concerned about removing the isdir output of pg_stat_file().
Maybe we could compromise to the extent of keeping that, allowing it
to be partially duplicative of a file-type-code output column.


On Tue, Nov 24, 2020 at 11:53:22AM -0500, Stephen Frost wrote:

I don't have any particular issue with keeping isdir as a convenience
column.  I agree it'll now be a bit duplicative but that seems alright.


Maybe we should do what Tom said, and leave pg_ls_* unchanged, but also mark
them as deprecated in favour of:
| pg_ls_dir_metadata(dir), dir={'pg_wal/archive_status', 'log', 'pg_wal', 
'base/pgsql_tmp'}


Haven't really time to review the patches here in detail right now
(maybe next month), but in general, I dislike marking things as
deprecated.  

Stephen, are you still planning to review these patches?

Regards,
--
-David
da...@pgmasters.net




Re: pg_ls_tmpdir to show directories and shared filesets (and pg_ls_*)

2020-12-23 Thread Stephen Frost
Greetings,

* Justin Pryzby (pry...@telsasoft.com) wrote:
> On Mon, Nov 23, 2020 at 04:14:18PM -0500, Tom Lane wrote:
> > * I don't think it's okay to change the existing signatures of
> > pg_ls_logdir() et al.  Even if you can make an argument that it's
> > not too harmful to add more output columns, replacing pg_stat_file's
> > isdir output with something of a different name and datatype is most
> > surely not OK --- there is no possible way that doesn't break existing
> > user queries.
> > 
> > I think possibly a more acceptable approach is to leave these functions
> > alone but add documentation explaining how to get the additional info.
> > You could say things along the lines of "pg_ls_waldir() is the same as
> > pg_ls_dir_metadata('pg_wal') except for showing fewer columns."
> 
> On Mon, Nov 23, 2020 at 06:06:19PM -0500, Tom Lane wrote:
> > I'm mostly concerned about removing the isdir output of pg_stat_file().
> > Maybe we could compromise to the extent of keeping that, allowing it
> > to be partially duplicative of a file-type-code output column.
> 
> On Tue, Nov 24, 2020 at 11:53:22AM -0500, Stephen Frost wrote:
> > I don't have any particular issue with keeping isdir as a convenience
> > column.  I agree it'll now be a bit duplicative but that seems alright.
> 
> Maybe we should do what Tom said, and leave pg_ls_* unchanged, but also mark
> them as deprecated in favour of:
> | pg_ls_dir_metadata(dir), dir={'pg_wal/archive_status', 'log', 'pg_wal', 
> 'base/pgsql_tmp'}

Haven't really time to review the patches here in detail right now
(maybe next month), but in general, I dislike marking things as
deprecated.  If we don't want to change them and we're happy to continue
supporting them as-is (which is what 'deprecated' really means), then we
can just do so- nothing stops us from that.  If we don't think the
current API makes sense, for whatever reason, we can just change that-
there's no need for a 'deprecation period', as we already have major
versions and support each major version for 5 years.

I haven't particularly strong feelings one way or the other regarding
these particular functions.  If you asked which way I leaned, I'd say
that I'd rather redefine the functions to make more sense and to be easy
to use for people who would like to use them.  I wouldn't object to new
functions to provide that either though.  I don't think there's all that
much code or that it's changed often enough to be a big burden to keep
both, but that's more feeling than anything based in actual research at
this point.

Thanks,

Stephen


signature.asc
Description: PGP signature


Re: pg_ls_tmpdir to show directories and shared filesets (and pg_ls_*)

2020-12-23 Thread Justin Pryzby
On Mon, Nov 23, 2020 at 04:14:18PM -0500, Tom Lane wrote:
> * I don't think it's okay to change the existing signatures of
> pg_ls_logdir() et al.  Even if you can make an argument that it's
> not too harmful to add more output columns, replacing pg_stat_file's
> isdir output with something of a different name and datatype is most
> surely not OK --- there is no possible way that doesn't break existing
> user queries.
> 
> I think possibly a more acceptable approach is to leave these functions
> alone but add documentation explaining how to get the additional info.
> You could say things along the lines of "pg_ls_waldir() is the same as
> pg_ls_dir_metadata('pg_wal') except for showing fewer columns."

On Mon, Nov 23, 2020 at 06:06:19PM -0500, Tom Lane wrote:
> I'm mostly concerned about removing the isdir output of pg_stat_file().
> Maybe we could compromise to the extent of keeping that, allowing it
> to be partially duplicative of a file-type-code output column.

On Tue, Nov 24, 2020 at 11:53:22AM -0500, Stephen Frost wrote:
> I don't have any particular issue with keeping isdir as a convenience
> column.  I agree it'll now be a bit duplicative but that seems alright.

Maybe we should do what Tom said, and leave pg_ls_* unchanged, but also mark
them as deprecated in favour of:
| pg_ls_dir_metadata(dir), dir={'pg_wal/archive_status', 'log', 'pg_wal', 
'base/pgsql_tmp'}

However, pg_ls_tmpdir is special since it handles tablespace tmpdirs, which it
seems is not trivial to get from sql:

+SELECT * FROM (SELECT DISTINCT 
COALESCE(NULLIF(pg_tablespace_location(b.oid),'')||suffix, 'base/pgsql_tmp') AS 
dir
+FROM pg_tablespace b, pg_control_system() pcs,
+LATERAL format('/PG_%s_%s', left(current_setting('server_version_num'), 2), 
pcs.catalog_version_no) AS suffix) AS dir,
+LATERAL pg_ls_dir_recurse(dir) AS a;

For context, the line of reasoning that led me to this patch series was
something like this:

0) Why can't I list shared tempfiles (dirs) using pg_ls_tmpdir() ?
1) Implement recursion for pg_ls_tmpdir();
2) Eventually realize that it's silly to implement a function to recurse into
   one particular directory when no general feature exists;
3) Implement generic facility;

> * I noticed that you did s/stat/lstat/.  That's fine on Unix systems,
> but it won't have any effect on Windows systems (cf bed90759f),
> which means that we'll have to document a platform-specific behavioral
> difference.  Do we want to go there?
>
> Maybe this patch needs to wait on somebody fixing our lack of real lstat() on 
> Windows.

I think only the "top" patches depend on lstat (for the "type" column and
recursion, to avoid loops).  The initial patches are independently useful, and
resolve the original issue of hiding tmpdirs.  I've rebased and re-arranged the
patches to reflect this.

-- 
Justin
>From 1fffd1c7af05298295e96fff189caf6eb6cc1539 Mon Sep 17 00:00:00 2001
From: Justin Pryzby 
Date: Mon, 16 Mar 2020 14:12:55 -0500
Subject: [PATCH v26 01/11] Document historic behavior of links to
 directories..

Backpatch to 9.5: pg_stat_file
---
 doc/src/sgml/func.sgml | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 2707e757ca..806a7b8a9a 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -26559,7 +26559,7 @@ SELECT convert_from(pg_read_binary_file('file_in_utf8.txt'), 'UTF8');
 Returns a record containing the file's size, last access time stamp,
 last modification time stamp, last file status change time stamp (Unix
 platforms only), file creation time stamp (Windows only), and a flag
-indicating if it is a directory.
+indicating if it is a directory (or a symbolic link to a directory).


 This function is restricted to superusers by default, but other users
-- 
2.17.0

>From 19c080b6163833323c53e48519f850d21d3a618d Mon Sep 17 00:00:00 2001
From: Justin Pryzby 
Date: Mon, 30 Mar 2020 18:59:16 -0500
Subject: [PATCH v26 02/11] pg_stat_file and pg_ls_dir_* to use lstat()..

pg_ls_dir_* will now skip (no longer show) symbolic links, same as other
non-regular file types, as we advertize we do since 8b6d94cf6.  That seems to
be the intented behavior, since irregular file types are 1) less portable; and,
2) we don't currently show a file's type except for "bool is_dir".

pg_stat_file will now 1) show metadata of links themselves, rather than their
target; and, 2) specifically, show links to directories with "is_dir=false";
and, 3) not error on broken symlinks.
---
 doc/src/sgml/func.sgml  | 2 +-
 src/backend/utils/adt/genfile.c | 4 ++--
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 806a7b8a9a..2707e757ca 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -26559,7 +26559,7 @@ SELECT convert_from(pg_read_binary_file('file_in_utf8.txt'), 'UTF8');
 Returns a record containing the 

Re: pg_ls_tmpdir to show directories and shared filesets (and pg_ls_*)

2020-12-09 Thread Justin Pryzby
On Fri, Dec 04, 2020 at 12:23:23PM -0500, Tom Lane wrote:
> Justin Pryzby  writes:
> [ v24-0001-Document-historic-behavior-of-links-to-directori.patch ]
> 
> The cfbot is unhappy with one of the test cases you added:

> Maybe it could be salvaged by reversing the sense of the WHERE condition
> so that instead of trying to blacklist stuff, you whitelist just a small
> number of files that should certainly be there.

Yes, I had noticed this one.

-- 
Justin
>From 00d390ac3754db140bb98c5b09bc38e3e9b0e45c Mon Sep 17 00:00:00 2001
From: Justin Pryzby 
Date: Mon, 16 Mar 2020 14:12:55 -0500
Subject: [PATCH v25 01/11] Document historic behavior of links to
 directories..

Backpatch to 9.5: pg_stat_file
---
 doc/src/sgml/func.sgml | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index df29af6371..7cafdb9107 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -25897,7 +25897,7 @@ SELECT convert_from(pg_read_binary_file('file_in_utf8.txt'), 'UTF8');
 Returns a record containing the file's size, last access time stamp,
 last modification time stamp, last file status change time stamp (Unix
 platforms only), file creation time stamp (Windows only), and a flag
-indicating if it is a directory.
+indicating if it is a directory (or a symbolic link to a directory).


 This function is restricted to superusers by default, but other users
-- 
2.17.0

>From 9ac2f2cf76a53348c505482f8df27ab0464006d1 Mon Sep 17 00:00:00 2001
From: Justin Pryzby 
Date: Mon, 30 Mar 2020 18:59:16 -0500
Subject: [PATCH v25 02/11] pg_stat_file and pg_ls_dir_* to use lstat()..

pg_ls_dir_* will now skip (no longer show) symbolic links, same as other
non-regular file types, as we advertize we do since 8b6d94cf6.  That seems to
be the intented behavior, since irregular file types are 1) less portable; and,
2) we don't currently show a file's type except for "bool is_dir".

pg_stat_file will now 1) show metadata of links themselves, rather than their
target; and, 2) specifically, show links to directories with "is_dir=false";
and, 3) not error on broken symlinks.
---
 doc/src/sgml/func.sgml  | 2 +-
 src/backend/utils/adt/genfile.c | 4 ++--
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 7cafdb9107..df29af6371 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -25897,7 +25897,7 @@ SELECT convert_from(pg_read_binary_file('file_in_utf8.txt'), 'UTF8');
 Returns a record containing the file's size, last access time stamp,
 last modification time stamp, last file status change time stamp (Unix
 platforms only), file creation time stamp (Windows only), and a flag
-indicating if it is a directory (or a symbolic link to a directory).
+indicating if it is a directory.


 This function is restricted to superusers by default, but other users
diff --git a/src/backend/utils/adt/genfile.c b/src/backend/utils/adt/genfile.c
index d34182a7b0..9f4927220b 100644
--- a/src/backend/utils/adt/genfile.c
+++ b/src/backend/utils/adt/genfile.c
@@ -406,7 +406,7 @@ pg_stat_file(PG_FUNCTION_ARGS)
 
 	filename = convert_and_check_filename(filename_t);
 
-	if (stat(filename, ) < 0)
+	if (lstat(filename, ) < 0)
 	{
 		if (missing_ok && errno == ENOENT)
 			PG_RETURN_NULL();
@@ -632,7 +632,7 @@ pg_ls_dir_files(FunctionCallInfo fcinfo, const char *dir, bool missing_ok)
 
 		/* Get the file info */
 		snprintf(path, sizeof(path), "%s/%s", dir, de->d_name);
-		if (stat(path, ) < 0)
+		if (lstat(path, ) < 0)
 		{
 			/* Ignore concurrently-deleted files, else complain */
 			if (errno == ENOENT)
-- 
2.17.0

>From a7729e28c3361fad3adc09388a59d47bdce46eb1 Mon Sep 17 00:00:00 2001
From: Justin Pryzby 
Date: Tue, 17 Mar 2020 13:16:24 -0500
Subject: [PATCH v25 03/11] Add tests on pg_ls_dir before changing it

---
 src/test/regress/expected/misc_functions.out | 24 
 src/test/regress/sql/misc_functions.sql  |  8 +++
 2 files changed, 32 insertions(+)

diff --git a/src/test/regress/expected/misc_functions.out b/src/test/regress/expected/misc_functions.out
index d3acb98d04..edbfd9abc1 100644
--- a/src/test/regress/expected/misc_functions.out
+++ b/src/test/regress/expected/misc_functions.out
@@ -201,6 +201,30 @@ select count(*) > 0 from
  t
 (1 row)
 
+select * from (select pg_ls_dir('.', false, true) as name) as ls where ls.name='.'; -- include_dot_dirs=true
+ name 
+--
+ .
+(1 row)
+
+select * from (select pg_ls_dir('.', false, false) as name) as ls where ls.name='.'; -- include_dot_dirs=false
+ name 
+--
+(0 rows)
+
+select pg_ls_dir('does not exist', true, false); -- ok with missingok=true
+ pg_ls_dir 
+---
+(0 rows)
+
+select pg_ls_dir('does not exist'); -- fails with missingok=false
+ERROR:  could not open directory "does not exist": No such file or 

Re: pg_ls_tmpdir to show directories and shared filesets (and pg_ls_*)

2020-12-04 Thread Tom Lane
Justin Pryzby  writes:
[ v24-0001-Document-historic-behavior-of-links-to-directori.patch ]

The cfbot is unhappy with one of the test cases you added:

6245@@ -259,9 +259,11 @@
6246 select path, filename, type from pg_ls_dir_metadata('.', true, false, 
true) where path!~'[0-9]|pg_internal.init|global.tmp' order by 1;
6247path   |   filename| type 
6248 --+---+--
6249+ PG_VERSION   | PG_VERSION| -
6250  base | base  | d
6251  base/pgsql_tmp   | pgsql_tmp | d
6252  global   | global| d
6253+ global/config_exec_params| config_exec_params| -
6254  global/pg_control| pg_control| -
6255  global/pg_filenode.map   | pg_filenode.map   | -
6256  pg_commit_ts | pg_commit_ts  | d
6257@@ -285,7 +287,6 @@
6258  pg_subtrans  | pg_subtrans   | d
6259  pg_tblspc| pg_tblspc | d
6260  pg_twophase  | pg_twophase   | d
6261- PG_VERSION   | PG_VERSION| -
6262  pg_wal   | pg_wal| d
6263  pg_wal/archive_status| archive_status| d
6264  pg_xact  | pg_xact   | d
6265@@ -293,7 +294,7 @@
6266  postgresql.conf  | postgresql.conf   | -
6267  postmaster.opts  | postmaster.opts   | -
6268  postmaster.pid   | postmaster.pid| -
6269-(34 rows)
6270+(35 rows)

This shows that (a) the test is sensitive to prevailing collation and
(b) it's not filtering out enough temporary files.  Even if those things
were fixed, though, the test would break every time we added/removed
some PGDATA substructure.  Worse, it'd also break if say somebody had
edited postgresql.conf and left an editor backup file behind, or when
running in an installation where the configuration files are someplace
else.  I think this is way too fragile to be acceptable.

Maybe it could be salvaged by reversing the sense of the WHERE condition
so that instead of trying to blacklist stuff, you whitelist just a small
number of files that should certainly be there.

regards, tom lane




Re: pg_ls_tmpdir to show directories and shared filesets (and pg_ls_*)

2020-11-29 Thread Justin Pryzby
On Mon, Nov 23, 2020 at 04:14:18PM -0500, Tom Lane wrote:
> Justin Pryzby  writes:
> >> This patch has been waiting for input from a committer on the approach I've
> >> taken with the patches since March 10, so I'm planning to set to "Ready" - 
> >> at
> >> least ready for senior review.
> 
> I took a quick look through this.  This is just MHO, of course:
> 
> * I don't think it's okay to change the existing signatures of
> pg_ls_logdir() et al.  Even if you can make an argument that it's
> not too harmful to add more output columns, replacing pg_stat_file's
> isdir output with something of a different name and datatype is most
> surely not OK --- there is no possible way that doesn't break existing
> user queries.
> 
> I think possibly a more acceptable approach is to leave these functions
> alone but add documentation explaining how to get the additional info.
> You could say things along the lines of "pg_ls_waldir() is the same as
> pg_ls_dir_metadata('pg_wal') except for showing fewer columns."
> 
> * I'm not very much on board with implementing pg_ls_dir_recurse()
> as a SQL function that depends on a WITH RECURSIVE construct.
> I do not think that's okay from either performance or security
> standpoints.  Surely it can't be hard to build a recursion capability

Thanks.  WITH RECURSIVE was the "new approach" I took early this year.  Of
course we can recurse in C, now that I know (how) to use the tuplestore.
Working on that patch was how I ran into the "LIMIT 1" SRF bug.

I don't see how security is relevant though, though, since someone can run a
the WITH query directly.  The function just needs to be restricted to
superusers, same as pg_ls_dir().

Anyway, I've re-ordered commits so this the last patch, since earlier commits
don't need to depend on it.  I don't think it's even essential to provide a
recursive function (anyone could write the CTE), so long as we don't hide dirs
and show isdir or type.

I implemented it first as a separate function and then as an optional argument
to pg_ls_dir_files().  If it's implemented as an optional "mode" of an existing
function, there's the constraint that returning a "path" argument has to be
after all other arguments (the ones that are useful without recursion) or else
it messes up other functions (like pg_ls_waldir()) that also call
pg_ls_dir_files().

> doesn't seem to have thought very carefully about the interaction
> of those two flags, ie it seems like LS_DIR_SKIP_HIDDEN effectively
> implies LS_DIR_SKIP_DOT_DIRS.  Do we want that?

Yes it's implied.  Those options exist to support the pre-existing behavior.
pg_ls_dir can optionaly show dotdirs, but pg_ls_*dir skip all hidden files
(which is documented since 8b6d94cf6).  I'm happy to implement something else
if a different behavior is desirable.

-- 
Justin
>From e313fdfab9b0f0538978eae8bd0d37c13d21453a Mon Sep 17 00:00:00 2001
From: Justin Pryzby 
Date: Mon, 16 Mar 2020 14:12:55 -0500
Subject: [PATCH v24 01/11] Document historic behavior of links to
 directories..

Backpatch to 9.5: pg_stat_file
---
 doc/src/sgml/func.sgml | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 507bc1a668..9c0ad7a334 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -25904,7 +25904,7 @@ SELECT convert_from(pg_read_binary_file('file_in_utf8.txt'), 'UTF8');
 Returns a record containing the file's size, last access time stamp,
 last modification time stamp, last file status change time stamp (Unix
 platforms only), file creation time stamp (Windows only), and a flag
-indicating if it is a directory.
+indicating if it is a directory (or a symbolic link to a directory).


 This function is restricted to superusers by default, but other users
-- 
2.17.0

>From 9e0bc43162286eadbdc1548ba240eeef0f4ca451 Mon Sep 17 00:00:00 2001
From: Justin Pryzby 
Date: Mon, 30 Mar 2020 18:59:16 -0500
Subject: [PATCH v24 02/11] pg_stat_file and pg_ls_dir_* to use lstat()..

pg_ls_dir_* will now skip (no longer show) symbolic links, same as other
non-regular file types, as we advertize we do since 8b6d94cf6.  That seems to
be the intented behavior, since irregular file types are 1) less portable; and,
2) we don't currently show a file's type except for "bool is_dir".

pg_stat_file will now 1) show metadata of links themselves, rather than their
target; and, 2) specifically, show links to directories with "is_dir=false";
and, 3) not error on broken symlinks.
---
 doc/src/sgml/func.sgml  | 2 +-
 src/backend/utils/adt/genfile.c | 4 ++--
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 9c0ad7a334..507bc1a668 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -25904,7 +25904,7 @@ SELECT convert_from(pg_read_binary_file('file_in_utf8.txt'), 'UTF8');
 Returns a record containing the file's size, last access 

Re: pg_ls_tmpdir to show directories and shared filesets (and pg_ls_*)

2020-11-24 Thread Stephen Frost
Greetings,

* Tom Lane (t...@sss.pgh.pa.us) wrote:
> Stephen Frost  writes:
> > * Tom Lane (t...@sss.pgh.pa.us) wrote:
> >> I took a quick look through this.  This is just MHO, of course:
> >> 
> >> * I don't think it's okay to change the existing signatures of
> >> pg_ls_logdir() et al.
> 
> > I disagree that we need to stress over this- we pretty routinely change
> > the signature of various catalogs and functions and anyone using these
> > is already of the understanding that we are free to make such changes
> > between major versions.
> 
> Well, like I said, just MHO.  Anybody else want to weigh in?
> 
> I'm mostly concerned about removing the isdir output of pg_stat_file().
> Maybe we could compromise to the extent of keeping that, allowing it
> to be partially duplicative of a file-type-code output column.

I don't have any particular issue with keeping isdir as a convenience
column.  I agree it'll now be a bit duplicative but that seems alright.

Thanks,

Stephen


signature.asc
Description: PGP signature


Re: pg_ls_tmpdir to show directories and shared filesets (and pg_ls_*)

2020-11-23 Thread Tom Lane
Stephen Frost  writes:
> * Tom Lane (t...@sss.pgh.pa.us) wrote:
>> I took a quick look through this.  This is just MHO, of course:
>> 
>> * I don't think it's okay to change the existing signatures of
>> pg_ls_logdir() et al.

> I disagree that we need to stress over this- we pretty routinely change
> the signature of various catalogs and functions and anyone using these
> is already of the understanding that we are free to make such changes
> between major versions.

Well, like I said, just MHO.  Anybody else want to weigh in?

I'm mostly concerned about removing the isdir output of pg_stat_file().
Maybe we could compromise to the extent of keeping that, allowing it
to be partially duplicative of a file-type-code output column.

regards, tom lane




Re: pg_ls_tmpdir to show directories and shared filesets (and pg_ls_*)

2020-11-23 Thread Stephen Frost
Greetings,

* Tom Lane (t...@sss.pgh.pa.us) wrote:
> Justin Pryzby  writes:
> >> This patch has been waiting for input from a committer on the approach I've
> >> taken with the patches since March 10, so I'm planning to set to "Ready" - 
> >> at
> >> least ready for senior review.
> 
> I took a quick look through this.  This is just MHO, of course:
> 
> * I don't think it's okay to change the existing signatures of
> pg_ls_logdir() et al.  Even if you can make an argument that it's
> not too harmful to add more output columns, replacing pg_stat_file's
> isdir output with something of a different name and datatype is most
> surely not OK --- there is no possible way that doesn't break existing
> user queries.

I disagree that we need to stress over this- we pretty routinely change
the signature of various catalogs and functions and anyone using these
is already of the understanding that we are free to make such changes
between major versions.  If anything, we should be strongly discouraging
the notion of "don't break user queries" when it comes to administrative
and monitoring functions like these because, otherwise, we end up with
things like the mess that is pg_start/stop_backup() (and just contrast
that to what we did to recovery.conf when thinking about "well, do we
need to 'deprecate' or keep around the old stuff so we don't break
things for users who use these functions?" or the changes made in v10,
neither of which have produced much in the way of complaints).

Let's focus on working towards cleaner APIs and functions, accepting a
break when it makes sense to, which seems to be the case with this patch
(though I agree about using a TAP test suite and about performing the
directory recursion in C instead), and not pull forward cruft that we
then are telling ourselves we have to maintain compatibility of
indefinitely and at the expense of sensible APIs.

Thanks,

Stephen


signature.asc
Description: PGP signature


Re: pg_ls_tmpdir to show directories and shared filesets (and pg_ls_*)

2020-11-23 Thread Tom Lane
Justin Pryzby  writes:
>> This patch has been waiting for input from a committer on the approach I've
>> taken with the patches since March 10, so I'm planning to set to "Ready" - at
>> least ready for senior review.

I took a quick look through this.  This is just MHO, of course:

* I don't think it's okay to change the existing signatures of
pg_ls_logdir() et al.  Even if you can make an argument that it's
not too harmful to add more output columns, replacing pg_stat_file's
isdir output with something of a different name and datatype is most
surely not OK --- there is no possible way that doesn't break existing
user queries.

I think possibly a more acceptable approach is to leave these functions
alone but add documentation explaining how to get the additional info.
You could say things along the lines of "pg_ls_waldir() is the same as
pg_ls_dir_metadata('pg_wal') except for showing fewer columns."

* I'm not very much on board with implementing pg_ls_dir_recurse()
as a SQL function that depends on a WITH RECURSIVE construct.
I do not think that's okay from either performance or security
standpoints.  Surely it can't be hard to build a recursion capability
into the C code?  We could then also debate whether this ought to be
a separate function at all, instead of something you invoke via an
additional boolean flag parameter to pg_ls_dir_metadata().

* I'm fairly unimpressed with the testing approach, because it doesn't
seem like you're getting very much coverage.  It's hard to do better while
still having the totally-fixed output expected by our regular regression
test framework, but to me that just says not to test these functions in
that framework.  I'd consider ripping all of that out in favor of a
TAP test.

While I didn't read the C code in any detail, a couple of things stood
out to me:

* I noticed that you did s/stat/lstat/.  That's fine on Unix systems,
but it won't have any effect on Windows systems (cf bed90759f),
which means that we'll have to document a platform-specific behavioral
difference.  Do we want to go there?  Maybe this patch needs to wait
on somebody fixing our lack of real lstat() on Windows.  (I assume BTW
that this means the WIN32 code in get_file_type() is unreachable.)

* This bit:

+   /* Skip dot dirs? */
+   if (flags & LS_DIR_SKIP_DOT_DIRS &&
+   (strcmp(de->d_name, ".") == 0 ||
+strcmp(de->d_name, "..") == 0))
+   continue;
+
+   /* Skip hidden files? */
+   if (flags & LS_DIR_SKIP_HIDDEN &&
+   de->d_name[0] == '.')
continue;

doesn't seem to have thought very carefully about the interaction
of those two flags, ie it seems like LS_DIR_SKIP_HIDDEN effectively
implies LS_DIR_SKIP_DOT_DIRS.  Do we want that?

regards, tom lane




Re: pg_ls_tmpdir to show directories and shared filesets (and pg_ls_*)

2020-11-05 Thread Justin Pryzby
On Wed, Oct 28, 2020 at 02:34:02PM -0500, Justin Pryzby wrote:
> On Tue, Sep 08, 2020 at 02:51:26PM -0500, Justin Pryzby wrote:
> > On Sat, Jul 18, 2020 at 03:15:32PM -0500, Justin Pryzby wrote:
> > > Still waiting for feedback from a committer.
> > 
> > This patch has been waiting for input from a committer on the approach I've
> > taken with the patches since March 10, so I'm planning to set to "Ready" - 
> > at
> > least ready for senior review.
> 
> @cfbot: rebased

Rebased on e152506adef4bc503ea7b8ebb4fedc0b8eebda81

-- 
Justin
>From 3e98e1e782f46d3b4ae8240d8cbb608af80bc9f8 Mon Sep 17 00:00:00 2001
From: Justin Pryzby 
Date: Mon, 16 Mar 2020 14:12:55 -0500
Subject: [PATCH v23 01/10] Document historic behavior of links to
 directories..

Backpatch to 9.5: pg_stat_file
---
 doc/src/sgml/func.sgml | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 7b1dc264f6..f068260188 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -25908,7 +25908,7 @@ SELECT convert_from(pg_read_binary_file('file_in_utf8.txt'), 'UTF8');
 Returns a record containing the file's size, last access time stamp,
 last modification time stamp, last file status change time stamp (Unix
 platforms only), file creation time stamp (Windows only), and a flag
-indicating if it is a directory.
+indicating if it is a directory (or a symbolic link to a directory).


 This function is restricted to superusers by default, but other users
-- 
2.17.0

>From 9cef25af7c265e040def5cf89286ec068edca075 Mon Sep 17 00:00:00 2001
From: Justin Pryzby 
Date: Mon, 30 Mar 2020 18:59:16 -0500
Subject: [PATCH v23 02/10] pg_stat_file and pg_ls_dir_* to use lstat()..

pg_ls_dir_* will now skip (no longer show) symbolic links, same as other
non-regular file types, as we advertize we do since 8b6d94cf6.  That seems to
be the intented behavior, since irregular file types are 1) less portable; and,
2) we don't currently show a file's type except for "bool is_dir".

pg_stat_file will now 1) show metadata of links themselves, rather than their
target; and, 2) specifically, show links to directories with "is_dir=false";
and, 3) not error on broken symlinks.
---
 doc/src/sgml/func.sgml  | 2 +-
 src/backend/utils/adt/genfile.c | 4 ++--
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index f068260188..7b1dc264f6 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -25908,7 +25908,7 @@ SELECT convert_from(pg_read_binary_file('file_in_utf8.txt'), 'UTF8');
 Returns a record containing the file's size, last access time stamp,
 last modification time stamp, last file status change time stamp (Unix
 platforms only), file creation time stamp (Windows only), and a flag
-indicating if it is a directory (or a symbolic link to a directory).
+indicating if it is a directory.


 This function is restricted to superusers by default, but other users
diff --git a/src/backend/utils/adt/genfile.c b/src/backend/utils/adt/genfile.c
index d34182a7b0..9f4927220b 100644
--- a/src/backend/utils/adt/genfile.c
+++ b/src/backend/utils/adt/genfile.c
@@ -406,7 +406,7 @@ pg_stat_file(PG_FUNCTION_ARGS)
 
 	filename = convert_and_check_filename(filename_t);
 
-	if (stat(filename, ) < 0)
+	if (lstat(filename, ) < 0)
 	{
 		if (missing_ok && errno == ENOENT)
 			PG_RETURN_NULL();
@@ -632,7 +632,7 @@ pg_ls_dir_files(FunctionCallInfo fcinfo, const char *dir, bool missing_ok)
 
 		/* Get the file info */
 		snprintf(path, sizeof(path), "%s/%s", dir, de->d_name);
-		if (stat(path, ) < 0)
+		if (lstat(path, ) < 0)
 		{
 			/* Ignore concurrently-deleted files, else complain */
 			if (errno == ENOENT)
-- 
2.17.0

>From 3b72b3e0941bd10e796744019adea3570fc64063 Mon Sep 17 00:00:00 2001
From: Justin Pryzby 
Date: Tue, 17 Mar 2020 13:16:24 -0500
Subject: [PATCH v23 03/10] Add tests on pg_ls_dir before changing it

---
 src/test/regress/expected/misc_functions.out | 18 ++
 src/test/regress/sql/misc_functions.sql  |  5 +
 2 files changed, 23 insertions(+)

diff --git a/src/test/regress/expected/misc_functions.out b/src/test/regress/expected/misc_functions.out
index d3acb98d04..2e87c548eb 100644
--- a/src/test/regress/expected/misc_functions.out
+++ b/src/test/regress/expected/misc_functions.out
@@ -201,6 +201,24 @@ select count(*) > 0 from
  t
 (1 row)
 
+select * from (select pg_ls_dir('.', false, true) as name) as ls where ls.name='.'; -- include_dot_dirs=true
+ name 
+--
+ .
+(1 row)
+
+select * from (select pg_ls_dir('.', false, false) as name) as ls where ls.name='.'; -- include_dot_dirs=false
+ name 
+--
+(0 rows)
+
+select pg_ls_dir('does not exist', true, false); -- ok with missingok=true
+ pg_ls_dir 
+---
+(0 rows)
+
+select pg_ls_dir('does not exist'); -- fails with 

Re: pg_ls_tmpdir to show directories and shared filesets (and pg_ls_*)

2020-10-28 Thread Justin Pryzby
On Tue, Sep 08, 2020 at 02:51:26PM -0500, Justin Pryzby wrote:
> On Sat, Jul 18, 2020 at 03:15:32PM -0500, Justin Pryzby wrote:
> > Still waiting for feedback from a committer.
> 
> This patch has been waiting for input from a committer on the approach I've
> taken with the patches since March 10, so I'm planning to set to "Ready" - at
> least ready for senior review.

@cfbot: rebased
>From 2b56825516d4107e16946b4e25084e5cdf85ba18 Mon Sep 17 00:00:00 2001
From: Justin Pryzby 
Date: Mon, 16 Mar 2020 14:12:55 -0500
Subject: [PATCH v22 01/10] Document historic behavior of links to
 directories..

Backpatch to 9.5: pg_stat_file
---
 doc/src/sgml/func.sgml | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 7ef2ec9972..1b67ef4be8 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -25917,7 +25917,7 @@ SELECT convert_from(pg_read_binary_file('file_in_utf8.txt'), 'UTF8');
 Returns a record containing the file's size, last access time stamp,
 last modification time stamp, last file status change time stamp (Unix
 platforms only), file creation time stamp (Windows only), and a flag
-indicating if it is a directory.
+indicating if it is a directory (or a symbolic link to a directory).


 This function is restricted to superusers by default, but other users
-- 
2.17.0

>From b50a250527cd4979f58d3f7af23ea69cbd200a0e Mon Sep 17 00:00:00 2001
From: Justin Pryzby 
Date: Mon, 30 Mar 2020 18:59:16 -0500
Subject: [PATCH v22 02/10] pg_stat_file and pg_ls_dir_* to use lstat()..

pg_ls_dir_* will now skip (no longer show) symbolic links, same as other
non-regular file types, as we advertize we do since 8b6d94cf6.  That seems to
be the intented behavior, since irregular file types are 1) less portable; and,
2) we don't currently show a file's type except for "bool is_dir".

pg_stat_file will now 1) show metadata of links themselves, rather than their
target; and, 2) specifically, show links to directories with "is_dir=false";
and, 3) not error on broken symlinks.
---
 doc/src/sgml/func.sgml  | 2 +-
 src/backend/utils/adt/genfile.c | 4 ++--
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 1b67ef4be8..7ef2ec9972 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -25917,7 +25917,7 @@ SELECT convert_from(pg_read_binary_file('file_in_utf8.txt'), 'UTF8');
 Returns a record containing the file's size, last access time stamp,
 last modification time stamp, last file status change time stamp (Unix
 platforms only), file creation time stamp (Windows only), and a flag
-indicating if it is a directory (or a symbolic link to a directory).
+indicating if it is a directory.


 This function is restricted to superusers by default, but other users
diff --git a/src/backend/utils/adt/genfile.c b/src/backend/utils/adt/genfile.c
index d34182a7b0..9f4927220b 100644
--- a/src/backend/utils/adt/genfile.c
+++ b/src/backend/utils/adt/genfile.c
@@ -406,7 +406,7 @@ pg_stat_file(PG_FUNCTION_ARGS)
 
 	filename = convert_and_check_filename(filename_t);
 
-	if (stat(filename, ) < 0)
+	if (lstat(filename, ) < 0)
 	{
 		if (missing_ok && errno == ENOENT)
 			PG_RETURN_NULL();
@@ -632,7 +632,7 @@ pg_ls_dir_files(FunctionCallInfo fcinfo, const char *dir, bool missing_ok)
 
 		/* Get the file info */
 		snprintf(path, sizeof(path), "%s/%s", dir, de->d_name);
-		if (stat(path, ) < 0)
+		if (lstat(path, ) < 0)
 		{
 			/* Ignore concurrently-deleted files, else complain */
 			if (errno == ENOENT)
-- 
2.17.0

>From 55f30341dfef59ca3015535b9ae62a7a40be083b Mon Sep 17 00:00:00 2001
From: Justin Pryzby 
Date: Tue, 17 Mar 2020 13:16:24 -0500
Subject: [PATCH v22 03/10] Add tests on pg_ls_dir before changing it

---
 src/test/regress/expected/misc_functions.out | 18 ++
 src/test/regress/sql/misc_functions.sql  |  5 +
 2 files changed, 23 insertions(+)

diff --git a/src/test/regress/expected/misc_functions.out b/src/test/regress/expected/misc_functions.out
index d3acb98d04..2e87c548eb 100644
--- a/src/test/regress/expected/misc_functions.out
+++ b/src/test/regress/expected/misc_functions.out
@@ -201,6 +201,24 @@ select count(*) > 0 from
  t
 (1 row)
 
+select * from (select pg_ls_dir('.', false, true) as name) as ls where ls.name='.'; -- include_dot_dirs=true
+ name 
+--
+ .
+(1 row)
+
+select * from (select pg_ls_dir('.', false, false) as name) as ls where ls.name='.'; -- include_dot_dirs=false
+ name 
+--
+(0 rows)
+
+select pg_ls_dir('does not exist', true, false); -- ok with missingok=true
+ pg_ls_dir 
+---
+(0 rows)
+
+select pg_ls_dir('does not exist'); -- fails with missingok=false
+ERROR:  could not open directory "does not exist": No such file or directory
 --
 -- Test adding a support function to a subject function
 --

Re: pg_ls_tmpdir to show directories and shared filesets (and pg_ls_*)

2020-09-08 Thread Justin Pryzby
On Sat, Jul 18, 2020 at 03:15:32PM -0500, Justin Pryzby wrote:
> Still waiting for feedback from a committer.

This patch has been waiting for input from a committer on the approach I've
taken with the patches since March 10, so I'm planning to set to "Ready" - at
least ready for senior review.

-- 
Justin




Re: pg_ls_tmpdir to show directories and shared filesets (and pg_ls_*)

2020-07-18 Thread Justin Pryzby
On Tue, Jul 14, 2020 at 10:08:39PM -0500, Justin Pryzby wrote:
> I'm still missing feedback from committers about the foundation of this
> approach.

Now rebased on top of fix for my own bug report (1d09fb1f).

I also changed argument handling for pg_ls_dir_recurse().

Passing '.' gave an initial path of . (of course) but then every other path
begins with './' which I didn't like, since it's ambiguous with empty path, or
.// or ././ ...  And one could pass './' which gives different output (like
././).  

So I specially handled the input of '.'.  Maybe the special value should be
NULL instead of ''.  But it looks no other system functions are currently
non-strict.

For pg_rewind testcase, getting the output path+filename uses a coalesce, since
the rest of the test does stuff like strcmp("pg_wal").

Still waiting for feedback from a committer.

-- 
Justin
>From e8d75d56c4e3b54f1af0ffbcdeddb46872dc308b Mon Sep 17 00:00:00 2001
From: Justin Pryzby 
Date: Mon, 16 Mar 2020 14:12:55 -0500
Subject: [PATCH v21 01/10] Document historic behavior of links to
 directories..

Backpatch to 9.5: pg_stat_file
---
 doc/src/sgml/func.sgml | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 959f6a1c2f..7ef8c7a847 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -25911,7 +25911,7 @@ SELECT convert_from(pg_read_binary_file('file_in_utf8.txt'), 'UTF8');
 Returns a record containing the file's size, last access time stamp,
 last modification time stamp, last file status change time stamp (Unix
 platforms only), file creation time stamp (Windows only), and a flag
-indicating if it is a directory.
+indicating if it is a directory (or a symbolic link to a directory).


 This function is restricted to superusers by default, but other users
-- 
2.17.0

>From 8f9d4dbfd13faa22446dd5ae95f62aa10b82d6e1 Mon Sep 17 00:00:00 2001
From: Justin Pryzby 
Date: Mon, 30 Mar 2020 18:59:16 -0500
Subject: [PATCH v21 02/10] pg_stat_file and pg_ls_dir_* to use lstat()..

pg_ls_dir_* will now skip (no longer show) symbolic links, same as other
non-regular file types, as we advertize we do since 8b6d94cf6.  That seems to
be the intented behavior, since irregular file types are 1) less portable; and,
2) we don't currently show a file's type except for "bool is_dir".

pg_stat_file will now 1) show metadata of links themselves, rather than their
target; and, 2) specifically, show links to directories with "is_dir=false";
and, 3) not error on broken symlinks.
---
 doc/src/sgml/func.sgml  | 2 +-
 src/backend/utils/adt/genfile.c | 4 ++--
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 7ef8c7a847..959f6a1c2f 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -25911,7 +25911,7 @@ SELECT convert_from(pg_read_binary_file('file_in_utf8.txt'), 'UTF8');
 Returns a record containing the file's size, last access time stamp,
 last modification time stamp, last file status change time stamp (Unix
 platforms only), file creation time stamp (Windows only), and a flag
-indicating if it is a directory (or a symbolic link to a directory).
+indicating if it is a directory.


 This function is restricted to superusers by default, but other users
diff --git a/src/backend/utils/adt/genfile.c b/src/backend/utils/adt/genfile.c
index d34182a7b0..9f4927220b 100644
--- a/src/backend/utils/adt/genfile.c
+++ b/src/backend/utils/adt/genfile.c
@@ -406,7 +406,7 @@ pg_stat_file(PG_FUNCTION_ARGS)
 
 	filename = convert_and_check_filename(filename_t);
 
-	if (stat(filename, ) < 0)
+	if (lstat(filename, ) < 0)
 	{
 		if (missing_ok && errno == ENOENT)
 			PG_RETURN_NULL();
@@ -632,7 +632,7 @@ pg_ls_dir_files(FunctionCallInfo fcinfo, const char *dir, bool missing_ok)
 
 		/* Get the file info */
 		snprintf(path, sizeof(path), "%s/%s", dir, de->d_name);
-		if (stat(path, ) < 0)
+		if (lstat(path, ) < 0)
 		{
 			/* Ignore concurrently-deleted files, else complain */
 			if (errno == ENOENT)
-- 
2.17.0

>From 15f4638e42b496ac6b4064f439dd6a2ea862bf69 Mon Sep 17 00:00:00 2001
From: Justin Pryzby 
Date: Tue, 17 Mar 2020 13:16:24 -0500
Subject: [PATCH v21 03/10] Add tests on pg_ls_dir before changing it

---
 src/test/regress/expected/misc_functions.out | 18 ++
 src/test/regress/sql/misc_functions.sql  |  5 +
 2 files changed, 23 insertions(+)

diff --git a/src/test/regress/expected/misc_functions.out b/src/test/regress/expected/misc_functions.out
index d3acb98d04..2e87c548eb 100644
--- a/src/test/regress/expected/misc_functions.out
+++ b/src/test/regress/expected/misc_functions.out
@@ -201,6 +201,24 @@ select count(*) > 0 from
  t
 (1 row)
 
+select * from (select pg_ls_dir('.', false, true) as name) as ls where ls.name='.'; -- include_dot_dirs=true
+ name 
+--
+ 

Re: pg_ls_tmpdir to show directories and shared filesets (and pg_ls_*)

2020-07-14 Thread Justin Pryzby
On Sun, Jun 21, 2020 at 08:53:25PM -0500, Justin Pryzby wrote:
> I'm still waiting to hear feedback from a commiter if this is a good idea to
> put this into the system catalog.  Right now, ts_debug is the only nontrivial
> function.

I'm still missing feedback from committers about the foundation of this
approach.

But I finally looked into the pg_rewind test failure 

That led met to keep the "dir" as a separate column, since that's what's needed
there, and it's more convenient to have a separate column than to provide a
column needing to be parsed.

-- 
Justin
>From 29846d21841a76b7ebfbf5dfbeef36963bff545b Mon Sep 17 00:00:00 2001
From: Justin Pryzby 
Date: Mon, 16 Mar 2020 14:12:55 -0500
Subject: [PATCH v20 01/10] Document historic behavior of links to
 directories..

Backpatch to 9.5: pg_stat_file
---
 doc/src/sgml/func.sgml | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 959f6a1c2f..7ef8c7a847 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -25911,7 +25911,7 @@ SELECT convert_from(pg_read_binary_file('file_in_utf8.txt'), 'UTF8');
 Returns a record containing the file's size, last access time stamp,
 last modification time stamp, last file status change time stamp (Unix
 platforms only), file creation time stamp (Windows only), and a flag
-indicating if it is a directory.
+indicating if it is a directory (or a symbolic link to a directory).


 This function is restricted to superusers by default, but other users
-- 
2.17.0

>From 971d6165400ee85fb0ec4c19ad36a8e1146cc382 Mon Sep 17 00:00:00 2001
From: Justin Pryzby 
Date: Mon, 30 Mar 2020 18:59:16 -0500
Subject: [PATCH v20 02/10] pg_stat_file and pg_ls_dir_* to use lstat()..

pg_ls_dir_* will now skip (no longer show) symbolic links, same as other
non-regular file types, as we advertize we do since 8b6d94cf6.  That seems to
be the intented behavior, since irregular file types are 1) less portable; and,
2) we don't currently show a file's type except for "bool is_dir".

pg_stat_file will now 1) show metadata of links themselves, rather than their
target; and, 2) specifically, show links to directories with "is_dir=false";
and, 3) not error on broken symlinks.
---
 doc/src/sgml/func.sgml  | 2 +-
 src/backend/utils/adt/genfile.c | 4 ++--
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 7ef8c7a847..959f6a1c2f 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -25911,7 +25911,7 @@ SELECT convert_from(pg_read_binary_file('file_in_utf8.txt'), 'UTF8');
 Returns a record containing the file's size, last access time stamp,
 last modification time stamp, last file status change time stamp (Unix
 platforms only), file creation time stamp (Windows only), and a flag
-indicating if it is a directory (or a symbolic link to a directory).
+indicating if it is a directory.


 This function is restricted to superusers by default, but other users
diff --git a/src/backend/utils/adt/genfile.c b/src/backend/utils/adt/genfile.c
index c1cc19d1f5..46d0977c2e 100644
--- a/src/backend/utils/adt/genfile.c
+++ b/src/backend/utils/adt/genfile.c
@@ -406,7 +406,7 @@ pg_stat_file(PG_FUNCTION_ARGS)
 
 	filename = convert_and_check_filename(filename_t);
 
-	if (stat(filename, ) < 0)
+	if (lstat(filename, ) < 0)
 	{
 		if (missing_ok && errno == ENOENT)
 			PG_RETURN_NULL();
@@ -632,7 +632,7 @@ pg_ls_dir_files(FunctionCallInfo fcinfo, const char *dir, bool missing_ok)
 
 		/* Get the file info */
 		snprintf(path, sizeof(path), "%s/%s", dir, de->d_name);
-		if (stat(path, ) < 0)
+		if (lstat(path, ) < 0)
 		{
 			/* Ignore concurrently-deleted files, else complain */
 			if (errno == ENOENT)
-- 
2.17.0

>From 75c95441077969bcb6d0151de4a9224697442a23 Mon Sep 17 00:00:00 2001
From: Justin Pryzby 
Date: Tue, 17 Mar 2020 13:16:24 -0500
Subject: [PATCH v20 03/10] Add tests on pg_ls_dir before changing it

---
 src/test/regress/expected/misc_functions.out | 18 ++
 src/test/regress/sql/misc_functions.sql  |  5 +
 2 files changed, 23 insertions(+)

diff --git a/src/test/regress/expected/misc_functions.out b/src/test/regress/expected/misc_functions.out
index d3acb98d04..2e87c548eb 100644
--- a/src/test/regress/expected/misc_functions.out
+++ b/src/test/regress/expected/misc_functions.out
@@ -201,6 +201,24 @@ select count(*) > 0 from
  t
 (1 row)
 
+select * from (select pg_ls_dir('.', false, true) as name) as ls where ls.name='.'; -- include_dot_dirs=true
+ name 
+--
+ .
+(1 row)
+
+select * from (select pg_ls_dir('.', false, false) as name) as ls where ls.name='.'; -- include_dot_dirs=false
+ name 
+--
+(0 rows)
+
+select pg_ls_dir('does not exist', true, false); -- ok with missingok=true
+ pg_ls_dir 
+---
+(0 rows)
+
+select pg_ls_dir('does not 

Re: pg_ls_tmpdir to show directories and shared filesets (and pg_ls_*)

2020-06-21 Thread Justin Pryzby
On Sun, Jun 07, 2020 at 10:07:19AM +0200, Fabien COELHO wrote:
> Hello Justin,
> > Rebased onto 7b48f1b490978a8abca61e9a9380f8de2a56f266 and renumbered OIDs.

Rebased again on whatever broke func.sgml.

> pg_stat_file() and pg_stat_dir_files() now return a char type, as well as
> the function which call them, but the documentation does not seem to say
> that it is the case.

Fixed, thanks

> I must admit that I'm not a fan on the argument management of
> pg_ls_dir_metadata and pg_ls_dir_metadata_1arg and others. I understand that
> it saves a few lines though, so maybe let it be.

I think you're saying that you don't like the _1arg functions, but they're
needed to allow the regression tests to pass:

| * note: this wrapper is necessary to pass the sanity check in opr_sanity,
| * which checks that all built-in functions that share the implementing C
| * function take the same number of arguments

> There is a comment in pg_ls_dir_files which talks about pg_ls_dir.
> 
> Could pg_ls_*dir functions C implementations be dropped in favor of a pure
> SQL implementation, like you did with recurse?

I'm still waiting to hear feedback from a commiter if this is a good idea to
put this into the system catalog.  Right now, ts_debug is the only nontrivial
function.

> If so, ISTM that pg_ls_dir_files() could be significantly simplified by
> moving its filtering flag to SQL conditions on "type" and others. That could
> allow not to change the existing function output a keep the "isdir" column
> defined as "type = 'd'" where it was used previously, if someone complains,
> but still have the full capability of "ls". That would also allow to drop
> the "*_1arg" hacks. Basically I'm advocating having 1 or 2 actual C
> functions, and all other variants managed at the SQL level.

You want to get rid of the 1arg stuff and just have one function.

I see your point, but I guess the C function would still need to accept a
"missing_ok" argument, so we need two functions, so there's not much utility in
getting rid of the "include_dot_dirs" argument, which is there for consistency
with pg_ls_dir.

Conceivably we could 1) get rid of pg_ls_dir, and 2) get rid of the
include_dot_dirs argument and 3) maybe make "missing_ok" a required argument;
and, 4) get rid of the C wrapper functions, and replace with a bunch of stuff
like this:

SELECT name, size, access, modification, change, creation, type='d' AS isdir
FROM pg_ls_dir_metadata('pg_wal') WHERE substring(name,1,1)!='.' AND type!='d';

Where the defaults I changed in this patchset still remain to be discussed:
with or without metadata, hidden files, dotdirs.

As I'm still waiting for committer feedback on the first 10 patches, so not
intending to add more.

-- 
Justin
>From 60593b397f03ec840a97f8004d97177dec463752 Mon Sep 17 00:00:00 2001
From: Justin Pryzby 
Date: Mon, 16 Mar 2020 14:12:55 -0500
Subject: [PATCH v19 01/10] Document historic behavior of links to
 directories..

Backpatch to 9.5: pg_stat_file
---
 doc/src/sgml/func.sgml | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index b7c450ea29..9f47745c5a 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -25881,7 +25881,7 @@ SELECT convert_from(pg_read_binary_file('file_in_utf8.txt'), 'UTF8');
 Returns a record containing the file's size, last access time stamp,
 last modification time stamp, last file status change time stamp (Unix
 platforms only), file creation time stamp (Windows only), and a flag
-indicating if it is a directory.
+indicating if it is a directory (or a symbolic link to a directory).


 This function is restricted to superusers by default, but other users
-- 
2.17.0

>From c1acf58316869e176bc2b215e2545f9183f100e3 Mon Sep 17 00:00:00 2001
From: Justin Pryzby 
Date: Mon, 30 Mar 2020 18:59:16 -0500
Subject: [PATCH v19 02/10] pg_stat_file and pg_ls_dir_* to use lstat()..

pg_ls_dir_* will now skip (no longer show) symbolic links, same as other
non-regular file types, as we advertize we do since 8b6d94cf6.  That seems to
be the intented behavior, since irregular file types are 1) less portable; and,
2) we don't currently show a file's type except for "bool is_dir".

pg_stat_file will now 1) show metadata of links themselves, rather than their
target; and, 2) specifically, show links to directories with "is_dir=false";
and, 3) not error on broken symlinks.
---
 doc/src/sgml/func.sgml  | 2 +-
 src/backend/utils/adt/genfile.c | 4 ++--
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 9f47745c5a..b7c450ea29 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -25881,7 +25881,7 @@ SELECT convert_from(pg_read_binary_file('file_in_utf8.txt'), 'UTF8');
 Returns a record containing the file's size, last access time stamp,
 last modification time stamp, last file status 

Re: pg_ls_tmpdir to show directories and shared filesets (and pg_ls_*)

2020-06-07 Thread Fabien COELHO



Hello Justin,


Rebased onto 7b48f1b490978a8abca61e9a9380f8de2a56f266 and renumbered OIDs.


Some feedback about v18, seen as one patch.

Patch applies cleanly & compiles. "make check" is okay.

pg_stat_file() and pg_stat_dir_files() now return a char type, as well as 
the function which call them, but the documentation does not seem to say 
that it is the case.


I must admit that I'm not a fan on the argument management of 
pg_ls_dir_metadata and pg_ls_dir_metadata_1arg and others. I understand 
that it saves a few lines though, so maybe let it be.


There is a comment in pg_ls_dir_files which talks about pg_ls_dir.

Could pg_ls_*dir functions C implementations be dropped in favor of a pure 
SQL implementation, like you did with recurse?


If so, ISTM that pg_ls_dir_files() could be significantly simplified by 
moving its filtering flag to SQL conditions on "type" and others. That 
could allow not to change the existing function output a keep the "isdir" 
column defined as "type = 'd'" where it was used previously, if someone 
complains, but still have the full capability of "ls". That would also 
allow to drop the "*_1arg" hacks. Basically I'm advocating having 1 or 2 
actual C functions, and all other variants managed at the SQL level.


--
Fabien.




Re: pg_ls_tmpdir to show directories and shared filesets (and pg_ls_*)

2020-05-25 Thread Justin Pryzby
Rebased onto 7b48f1b490978a8abca61e9a9380f8de2a56f266 and renumbered OIDs.
>From bb41ae268041b7e7771930d533a8ca20a00805c7 Mon Sep 17 00:00:00 2001
From: Justin Pryzby 
Date: Mon, 16 Mar 2020 14:12:55 -0500
Subject: [PATCH v18 01/10] Document historic behavior of links to
 directories..

Backpatch to 9.5: pg_stat_file
---
 doc/src/sgml/func.sgml | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 7c06afd3ea..11f5ef9372 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -25861,7 +25861,7 @@ SELECT convert_from(pg_read_binary_file('file_in_utf8.txt'), 'UTF8');
 Returns a record containing the file's size, last access time stamp,
 last modification time stamp, last file status change time stamp (Unix
 platforms only), file creation time stamp (Windows only), and a flag
-indicating if it is a directory.
+indicating if it is a directory (or a symbolic link to a directory).


 This function is restricted to superusers by default, but other users
-- 
2.17.0

>From af4275ac21bf6699b5265b4d80f02de7a8e6cb9f Mon Sep 17 00:00:00 2001
From: Justin Pryzby 
Date: Mon, 30 Mar 2020 18:59:16 -0500
Subject: [PATCH v18 02/10] pg_stat_file and pg_ls_dir_* to use lstat()..

pg_ls_dir_* will now skip (no longer show) symbolic links, same as other
non-regular file types, as we advertize we do since 8b6d94cf6.  That seems to
be the intented behavior, since irregular file types are 1) less portable; and,
2) we don't currently show a file's type except for "bool is_dir".

pg_stat_file will now 1) show metadata of links themselves, rather than their
target; and, 2) specifically, show links to directories with "is_dir=false";
and, 3) not error on broken symlinks.
---
 doc/src/sgml/func.sgml  | 2 +-
 src/backend/utils/adt/genfile.c | 4 ++--
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 11f5ef9372..7c06afd3ea 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -25861,7 +25861,7 @@ SELECT convert_from(pg_read_binary_file('file_in_utf8.txt'), 'UTF8');
 Returns a record containing the file's size, last access time stamp,
 last modification time stamp, last file status change time stamp (Unix
 platforms only), file creation time stamp (Windows only), and a flag
-indicating if it is a directory (or a symbolic link to a directory).
+indicating if it is a directory.


 This function is restricted to superusers by default, but other users
diff --git a/src/backend/utils/adt/genfile.c b/src/backend/utils/adt/genfile.c
index ceaa6180da..219ac160f8 100644
--- a/src/backend/utils/adt/genfile.c
+++ b/src/backend/utils/adt/genfile.c
@@ -370,7 +370,7 @@ pg_stat_file(PG_FUNCTION_ARGS)
 
 	filename = convert_and_check_filename(filename_t);
 
-	if (stat(filename, ) < 0)
+	if (lstat(filename, ) < 0)
 	{
 		if (missing_ok && errno == ENOENT)
 			PG_RETURN_NULL();
@@ -596,7 +596,7 @@ pg_ls_dir_files(FunctionCallInfo fcinfo, const char *dir, bool missing_ok)
 
 		/* Get the file info */
 		snprintf(path, sizeof(path), "%s/%s", dir, de->d_name);
-		if (stat(path, ) < 0)
+		if (lstat(path, ) < 0)
 		{
 			/* Ignore concurrently-deleted files, else complain */
 			if (errno == ENOENT)
-- 
2.17.0

>From c122fc6e1d80beac33478a52fd7bdc6835fde317 Mon Sep 17 00:00:00 2001
From: Justin Pryzby 
Date: Tue, 17 Mar 2020 13:16:24 -0500
Subject: [PATCH v18 03/10] Add tests on pg_ls_dir before changing it

---
 src/test/regress/expected/misc_functions.out | 18 ++
 src/test/regress/sql/misc_functions.sql  |  5 +
 2 files changed, 23 insertions(+)

diff --git a/src/test/regress/expected/misc_functions.out b/src/test/regress/expected/misc_functions.out
index d3acb98d04..2e87c548eb 100644
--- a/src/test/regress/expected/misc_functions.out
+++ b/src/test/regress/expected/misc_functions.out
@@ -201,6 +201,24 @@ select count(*) > 0 from
  t
 (1 row)
 
+select * from (select pg_ls_dir('.', false, true) as name) as ls where ls.name='.'; -- include_dot_dirs=true
+ name 
+--
+ .
+(1 row)
+
+select * from (select pg_ls_dir('.', false, false) as name) as ls where ls.name='.'; -- include_dot_dirs=false
+ name 
+--
+(0 rows)
+
+select pg_ls_dir('does not exist', true, false); -- ok with missingok=true
+ pg_ls_dir 
+---
+(0 rows)
+
+select pg_ls_dir('does not exist'); -- fails with missingok=false
+ERROR:  could not open directory "does not exist": No such file or directory
 --
 -- Test adding a support function to a subject function
 --
diff --git a/src/test/regress/sql/misc_functions.sql b/src/test/regress/sql/misc_functions.sql
index 094e8f8296..f6857ad177 100644
--- a/src/test/regress/sql/misc_functions.sql
+++ b/src/test/regress/sql/misc_functions.sql
@@ -60,6 +60,11 @@ select count(*) > 0 from
where spcname = 'pg_default') pts
   join 

Re: pg_ls_tmpdir to show directories and shared filesets (and pg_ls_*)

2020-05-07 Thread Justin Pryzby
Rebased onto 1ad23335f36b07f4574906a8dc66a3d62af7c40c
>From 698026f6365a324bf52c5985d549f71b52ada567 Mon Sep 17 00:00:00 2001
From: Justin Pryzby 
Date: Mon, 16 Mar 2020 14:12:55 -0500
Subject: [PATCH v17 01/10] Document historic behavior of links to
 directories..

Backpatch to 9.5: pg_stat_file
---
 doc/src/sgml/func.sgml | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index d9b3598977..e0d1eff6b5 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -25861,7 +25861,7 @@ SELECT convert_from(pg_read_binary_file('file_in_utf8.txt'), 'UTF8');
 Returns a record containing the file's size, last access time stamp,
 last modification time stamp, last file status change time stamp (Unix
 platforms only), file creation time stamp (Windows only), and a flag
-indicating if it is a directory.
+indicating if it is a directory (or a symbolic link to a directory).


 This function is restricted to superusers by default, but other users
-- 
2.17.0

>From ecd5654cb1b22254387d3584bbc28cb1af046a62 Mon Sep 17 00:00:00 2001
From: Justin Pryzby 
Date: Mon, 30 Mar 2020 18:59:16 -0500
Subject: [PATCH v17 02/10] pg_stat_file and pg_ls_dir_* to use lstat()..

pg_ls_dir_* will now skip (no longer show) symbolic links, same as other
non-regular file types, as we advertize we do since 8b6d94cf6.  That seems to
be the intented behavior, since irregular file types are 1) less portable; and,
2) we don't currently show a file's type except for "bool is_dir".

pg_stat_file will now 1) show metadata of links themselves, rather than their
target; and, 2) specifically, show links to directories with "is_dir=false";
and, 3) not error on broken symlinks.
---
 doc/src/sgml/func.sgml  | 2 +-
 src/backend/utils/adt/genfile.c | 4 ++--
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index e0d1eff6b5..d9b3598977 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -25861,7 +25861,7 @@ SELECT convert_from(pg_read_binary_file('file_in_utf8.txt'), 'UTF8');
 Returns a record containing the file's size, last access time stamp,
 last modification time stamp, last file status change time stamp (Unix
 platforms only), file creation time stamp (Windows only), and a flag
-indicating if it is a directory (or a symbolic link to a directory).
+indicating if it is a directory.


 This function is restricted to superusers by default, but other users
diff --git a/src/backend/utils/adt/genfile.c b/src/backend/utils/adt/genfile.c
index ceaa6180da..219ac160f8 100644
--- a/src/backend/utils/adt/genfile.c
+++ b/src/backend/utils/adt/genfile.c
@@ -370,7 +370,7 @@ pg_stat_file(PG_FUNCTION_ARGS)
 
 	filename = convert_and_check_filename(filename_t);
 
-	if (stat(filename, ) < 0)
+	if (lstat(filename, ) < 0)
 	{
 		if (missing_ok && errno == ENOENT)
 			PG_RETURN_NULL();
@@ -596,7 +596,7 @@ pg_ls_dir_files(FunctionCallInfo fcinfo, const char *dir, bool missing_ok)
 
 		/* Get the file info */
 		snprintf(path, sizeof(path), "%s/%s", dir, de->d_name);
-		if (stat(path, ) < 0)
+		if (lstat(path, ) < 0)
 		{
 			/* Ignore concurrently-deleted files, else complain */
 			if (errno == ENOENT)
-- 
2.17.0

>From c5b407dd179019a9b79c29f921f1f2366c7d52ed Mon Sep 17 00:00:00 2001
From: Justin Pryzby 
Date: Tue, 17 Mar 2020 13:16:24 -0500
Subject: [PATCH v17 03/10] Add tests on pg_ls_dir before changing it

---
 src/test/regress/expected/misc_functions.out | 18 ++
 src/test/regress/sql/misc_functions.sql  |  5 +
 2 files changed, 23 insertions(+)

diff --git a/src/test/regress/expected/misc_functions.out b/src/test/regress/expected/misc_functions.out
index d3acb98d04..2e87c548eb 100644
--- a/src/test/regress/expected/misc_functions.out
+++ b/src/test/regress/expected/misc_functions.out
@@ -201,6 +201,24 @@ select count(*) > 0 from
  t
 (1 row)
 
+select * from (select pg_ls_dir('.', false, true) as name) as ls where ls.name='.'; -- include_dot_dirs=true
+ name 
+--
+ .
+(1 row)
+
+select * from (select pg_ls_dir('.', false, false) as name) as ls where ls.name='.'; -- include_dot_dirs=false
+ name 
+--
+(0 rows)
+
+select pg_ls_dir('does not exist', true, false); -- ok with missingok=true
+ pg_ls_dir 
+---
+(0 rows)
+
+select pg_ls_dir('does not exist'); -- fails with missingok=false
+ERROR:  could not open directory "does not exist": No such file or directory
 --
 -- Test adding a support function to a subject function
 --
diff --git a/src/test/regress/sql/misc_functions.sql b/src/test/regress/sql/misc_functions.sql
index 094e8f8296..f6857ad177 100644
--- a/src/test/regress/sql/misc_functions.sql
+++ b/src/test/regress/sql/misc_functions.sql
@@ -60,6 +60,11 @@ select count(*) > 0 from
where spcname = 'pg_default') pts
   join pg_database db on 

Re: pg_ls_tmpdir to show directories and shared filesets (and pg_ls_*)

2020-05-02 Thread Justin Pryzby
On Sun, Apr 12, 2020 at 01:53:40PM +0200, Fabien COELHO wrote:
> About v15, seen as one patch.

Thanks for looking.

> - I'm wondering whether could pg_stat_file call pg_ls_dir_files without
>   too much effort? ISTM that the output structure nearly the same. I do
>   not like much having one function specialized for files and one for
>   directories.

I refactored but not like that.  As I mentioned in the commit message, I don't
see a good way to make a function operate on a file when the function's primary
data structure is a DIR*.  Do you ?  I don't think it should call stat() and
then conditionally branch off to pg_stat_file().

There are two functions because they wrap two separate syscalls, which see as
good, transparent goal.  If we want a function that does what "ls -al" does,
that would also be a good example to follow, except that we already didn't
follow it.

/bin/ls first stat()s the path, and then either outputs its metadata (if it's a
file or if -d was specified) or lists a dir.  It's essentially a wrapper around
*two* system calls (stat and readdir/getdents).

Maybe we could invent a new pg_ls() which does that, and then refactor existing
code.  Or, maybe it would be a SQL function which calls stat() and then
conditionally calls pg_ls_dir if isdir=True (or type='d').  That would be easy
if we merge the commit which outputs all stat fields.

I'm still hoping for confirmation from a committer if this approach is worth
pursuing:

https://www.postgresql.org/message-id/20200310183037.GA29065%40telsasoft.com
https://www.postgresql.org/message-id/20200313131232.GO29065%40telsasoft.com
|Rather than making "recurse into tmpdir" the end goal:
|
|  - add a function to show metadata of an arbitrary dir;
|  - add isdir arguments to pg_ls_* functions (including pg_ls_tmpdir but not
|pg_ls_dir).
|  - maybe add pg_ls_dir_recurse, which satisfies the original need;
|  - retire pg_ls_dir (does this work with tuplestore?)
|  - profit

-- 
Justin
>From 6175cecd312296bbb9099834d91ffaa50e059f6c Mon Sep 17 00:00:00 2001
From: Justin Pryzby 
Date: Mon, 16 Mar 2020 14:12:55 -0500
Subject: [PATCH v16 01/10] Document historic behavior of links to
 directories..

Backpatch to 9.5: pg_stat_file
---
 doc/src/sgml/func.sgml | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 96ea57eedd..9b885102da 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -25486,7 +25486,8 @@ SELECT convert_from(pg_read_binary_file('file_in_utf8.txt'), 'UTF8');
 size, last accessed time stamp, last modified time stamp,
 last file status change time stamp (Unix platforms only),
 file creation time stamp (Windows only), and a boolean
-indicating if it is a directory.  Typical usages include:
+indicating if it is a directory (or a symbolic link to a directory).
+Typical usages include:
 
 SELECT * FROM pg_stat_file('filename');
 SELECT (pg_stat_file('filename')).modification;
-- 
2.17.0

>From 52fd41bfacaba66f26f1e75b34e258b3a0f12372 Mon Sep 17 00:00:00 2001
From: Justin Pryzby 
Date: Mon, 30 Mar 2020 18:59:16 -0500
Subject: [PATCH v16 02/10] pg_stat_file and pg_ls_dir_* to use lstat()..

pg_ls_dir_* will now skip (no longer show) symbolic links, same as other
non-regular file types, as we advertize we do since 8b6d94cf6.  That seems to
be the intented behavior, since irregular file types are 1) less portable; and,
2) we don't currently show a file's type except for "bool is_dir".

pg_stat_file will now 1) show metadata of links themselves, rather than their
target; and, 2) specifically, show links to directories with "is_dir=false";
and, 3) not error on broken symlinks.
---
 doc/src/sgml/func.sgml  | 2 +-
 src/backend/utils/adt/genfile.c | 4 ++--
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 9b885102da..96b08d0500 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -25486,7 +25486,7 @@ SELECT convert_from(pg_read_binary_file('file_in_utf8.txt'), 'UTF8');
 size, last accessed time stamp, last modified time stamp,
 last file status change time stamp (Unix platforms only),
 file creation time stamp (Windows only), and a boolean
-indicating if it is a directory (or a symbolic link to a directory).
+indicating if it is a directory.
 Typical usages include:
 
 SELECT * FROM pg_stat_file('filename');
diff --git a/src/backend/utils/adt/genfile.c b/src/backend/utils/adt/genfile.c
index ceaa6180da..219ac160f8 100644
--- a/src/backend/utils/adt/genfile.c
+++ b/src/backend/utils/adt/genfile.c
@@ -370,7 +370,7 @@ pg_stat_file(PG_FUNCTION_ARGS)
 
 	filename = convert_and_check_filename(filename_t);
 
-	if (stat(filename, ) < 0)
+	if (lstat(filename, ) < 0)
 	{
 		if (missing_ok && errno == ENOENT)
 			PG_RETURN_NULL();
@@ -596,7 +596,7 @@ pg_ls_dir_files(FunctionCallInfo fcinfo, const char *dir, bool missing_ok)
 
 		/* Get the 

Re: pg_ls_tmpdir to show directories and shared filesets (and pg_ls_*)

2020-04-12 Thread Fabien COELHO



Hello Justin,

About v15, seen as one patch.

Patches serie applies cleanly, compiles, "make check" ok.

Documentation:
 - indent documentation text around 80 cols, as done around?
 - indent SQL example for readability and capitalize keywords
   (pg_ls_dir_metadata)
 - "For each file in a directory, list the file and its metadata."
   maybe: "List files and their metadata in a directory"?

Code:
 - Most pg_ls_*dir* functions call pg_ls_dir_files(), which looks like
   reasonable refactoring, ISTM that the code is actually smaller.
 - please follow pg style, eg not "} else {"
 - there is a "XXX" (meaning fixme?) tag remaining in a comment.
 - file types: why not do block & character devices, fifo and socket
   as well, before the unkown case?
 - I'm wondering whether could pg_stat_file call pg_ls_dir_files without
   too much effort? ISTM that the output structure nearly the same. I do
   not like much having one function specialized for files and one for
   directories.

Tests:
 - good, there are some!
 - indent SQL code, eg by starting a new line on new clauses?
 - put comments on separate lines (I'm not against it on principle, I do
   that, but I do not think that it is done much in test files).

--
Fabien.




Re: pg_ls_tmpdir to show directories and shared filesets (and pg_ls_*)

2020-03-31 Thread Justin Pryzby
On Tue, Mar 17, 2020 at 02:04:01PM -0500, Justin Pryzby wrote:
> > The example in the documentation could be better indented. Also, ISTM that
> > there are two implicit laterals (format & pg_ls_dir_recurse) that I would
> > make explicit. I'd use the pcs alias explicitely. I'd use meaningful aliases
> > (eg ts instead of b, …).
> 
> > On reflection, I think that a boolean "isdir" column is a bad idea because
> > it is not extensible. I'd propose to switch to the standard "ls" approach of
> > providing the type as one character: '-' for regular, 'd' for directory, 'l'
> > for link, 's' for socket, 'c' for character special…
> 
> I think that's outside the scope of the patch, since I'd want to change
> pg_stat_file; that's where I borrowed "isdir" from, for consistency.
> 
> Note that both LS_DIR_HISTORIC and LS_DIR_MODERN include LS_DIR_SKIP_SPECIAL,
> so only pg_ls_dir itself show specials, so they way to do it would be to 1)
> change pg_stat_file to expose the file's "type", 2) use pg_ls_dir() AS a,
> lateral pg_stat_file(a) AS b, 3) then consider also changing LS_DIR_MODERN and
> all the existing pg_ls_*.

The patch intends to fix the issue of "failing to show failed filesets"
(because dirs are skipped) while also generalizing existing functions (to show
directories and "isdir" column) and providing some more flexible ones (to list
file and metadata of a dir, which is currently possible [only] for "special"
directories, or by recursively calling pg_stat_file).

I'm still of the opinion that supporting arbitrary file types is out of scope,
but I changed the "isdir" to show "type".  I'm only supporting '[-dl]'.  I
don't want to have to check #ifdef S_ISDOOR or whatever other vendors have.  I
insist that it is a separate patch, since it depends on everything else, and I
have no feedback from anybody else as to whether any of that is desired.

template1=# SELECT * FROM pg_ls_waldir();
   name   |   size   | access |  
modification  | change | creation | type 
--+--++++--+--
 barr |0 | 2020-03-31 14:43:11-05 | 2020-03-31 
14:43:11-05 | 2020-03-31 14:43:11-05 |  | ?
 baz  | 4096 | 2020-03-31 14:39:18-05 | 2020-03-31 
14:39:18-05 | 2020-03-31 14:39:18-05 |  | d
 foo  |0 | 2020-03-31 14:39:37-05 | 2020-03-31 
14:39:37-05 | 2020-03-31 14:39:37-05 |  | -
 archive_status   | 4096 | 2020-03-31 14:38:20-05 | 2020-03-31 
14:38:18-05 | 2020-03-31 14:38:18-05 |  | d
 00010001 | 16777216 | 2020-03-31 14:42:53-05 | 2020-03-31 
14:43:08-05 | 2020-03-31 14:43:08-05 |  | -
 bar  |3 | 2020-03-31 14:39:16-05 | 2020-03-31 
14:39:01-05 | 2020-03-31 14:39:01-05 |  | l

>From f6b54482cc1d1b9595f49211b14807a20f994a3f Mon Sep 17 00:00:00 2001
From: Justin Pryzby 
Date: Mon, 16 Mar 2020 14:12:55 -0500
Subject: [PATCH v15 01/10] Document historic behavior of links to
 directories..

Backpatch to 9.5: pg_stat_file
---
 doc/src/sgml/func.sgml | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 7a0bb0c70a..25b1278459 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -21558,7 +21558,8 @@ SELECT convert_from(pg_read_binary_file('file_in_utf8.txt'), 'UTF8');
 size, last accessed time stamp, last modified time stamp,
 last file status change time stamp (Unix platforms only),
 file creation time stamp (Windows only), and a boolean
-indicating if it is a directory.  Typical usages include:
+indicating if it is a directory (or a symbolic link to a directory).
+Typical usages include:
 
 SELECT * FROM pg_stat_file('filename');
 SELECT (pg_stat_file('filename')).modification;
-- 
2.17.0

>From e1c8750fb777df4bf41f36df7ae6cde39b907ec2 Mon Sep 17 00:00:00 2001
From: Justin Pryzby 
Date: Mon, 30 Mar 2020 18:59:16 -0500
Subject: [PATCH v15 02/10] pg_stat_file and pg_ls_dir_* to use lstat()..

pg_ls_dir_* will now skip (no longer show) symbolic links, same as other
non-regular file types, as we advertize we do since 8b6d94cf6.  That seems to
be the intented behavior, since irregular file types are 1) less portable; and,
2) we don't currently show a file's type except for "bool is_dir".

pg_stat_file will now 1) show metadata of links themselves, rather than their
target; and, 2) specifically, show links to directories with "is_dir=false";
and, 3) not error on broken symlinks.
---
 doc/src/sgml/func.sgml  | 2 +-
 src/backend/utils/adt/genfile.c | 4 ++--
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 25b1278459..630f03d7dd 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -21558,7 +21558,7 @@ 

Re: pg_ls_tmpdir to show directories and shared filesets (and pg_ls_*)

2020-03-17 Thread Tom Lane
Justin Pryzby  writes:
> It seems like the only way to make variable number of arguments is is with
> multiple entries in pg_proc.dat, one for each "number of" arguments.  Is that
> right ?

Another way to do it is to have one entry, put the full set of arguments
into the initial pg_proc.dat data, and then use CREATE OR REPLACE FUNCTION
later during initdb to install some defaults.  See existing cases in
system_views.sql, starting about line 1180.  Neither way is especially
pretty, so take your choice.

regards, tom lane




Re: pg_ls_tmpdir to show directories and shared filesets (and pg_ls_*)

2020-03-17 Thread Justin Pryzby
On Tue, Mar 17, 2020 at 10:21:48AM +0100, Fabien COELHO wrote:
> 
> About v13, seens as one patch:
> 
> Function "pg_ls_dir_metadata" documentation suggests a variable number of
> arguments with brackets, but parameters are really mandatory.

Fixed, and added tests on 1 and 3 arg versions of both pg_ls_dir() and
pg_ls_dir_metadata().

It seems like the only way to make variable number of arguments is is with
multiple entries in pg_proc.dat, one for each "number of" arguments.  Is that
right ?

> The example in the documentation could be better indented. Also, ISTM that
> there are two implicit laterals (format & pg_ls_dir_recurse) that I would
> make explicit. I'd use the pcs alias explicitely. I'd use meaningful aliases
> (eg ts instead of b, …).

> On reflection, I think that a boolean "isdir" column is a bad idea because
> it is not extensible. I'd propose to switch to the standard "ls" approach of
> providing the type as one character: '-' for regular, 'd' for directory, 'l'
> for link, 's' for socket, 'c' for character special…

I think that's outside the scope of the patch, since I'd want to change
pg_stat_file; that's where I borrowed "isdir" from, for consistency.

Note that both LS_DIR_HISTORIC and LS_DIR_MODERN include LS_DIR_SKIP_SPECIAL,
so only pg_ls_dir itself show specials, so they way to do it would be to 1)
change pg_stat_file to expose the file's "type", 2) use pg_ls_dir() AS a,
lateral pg_stat_file(a) AS b, 3) then consider also changing LS_DIR_MODERN and
all the existing pg_ls_*.

> ISTM that "lstat" is not available on windows, which suggests to call "stat"
> always, and then "lstat" on un*x and pg ports stuff on win.

I believe that's handled here.
src/include/port/win32_port.h:#define lstat(path, sb) stat(path, sb)

> I'm wondering about the restriction on directories only. Why should it not
> work on a file? Can it be easily extended to work on a simple file? If so,
> it could be just "pg_ls".

I think that's a good idea, except it doesn't fit with what the code does:
AllocDir() and ReadDir().  Instead, use pg_stat_file() for that.

Hm, I realized that the existing pg_ls_dir_metadata was skipping links to dirs,
since !ISREG().  So changed to use both stat() and lstat().

-- 
Justin
>From d8294c4747c5ba1f3bec858c137cc2d31e5a0425 Mon Sep 17 00:00:00 2001
From: Justin Pryzby 
Date: Mon, 16 Mar 2020 14:12:55 -0500
Subject: [PATCH v14 1/8] Document historic behavior of links to directories..

Backpatch to 9.5: pg_stat_file
---
 doc/src/sgml/func.sgml | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index fc4d7f0f78..2c6142a0e0 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -21528,7 +21528,8 @@ SELECT convert_from(pg_read_binary_file('file_in_utf8.txt'), 'UTF8');
 size, last accessed time stamp, last modified time stamp,
 last file status change time stamp (Unix platforms only),
 file creation time stamp (Windows only), and a boolean
-indicating if it is a directory.  Typical usages include:
+indicating if it is a directory (or a symbolic link to a directory).
+Typical usages include:
 
 SELECT * FROM pg_stat_file('filename');
 SELECT (pg_stat_file('filename')).modification;
-- 
2.17.0

>From 031ce2684587d6d8d84c6d4b6c87f5399b6efb78 Mon Sep 17 00:00:00 2001
From: Justin Pryzby 
Date: Tue, 17 Mar 2020 13:16:24 -0500
Subject: [PATCH v14 2/8] Add tests on pg_ls_dir before changing it

---
 src/test/regress/expected/misc_functions.out | 18 ++
 src/test/regress/sql/misc_functions.sql  |  5 +
 2 files changed, 23 insertions(+)

diff --git a/src/test/regress/expected/misc_functions.out b/src/test/regress/expected/misc_functions.out
index d3acb98d04..2e87c548eb 100644
--- a/src/test/regress/expected/misc_functions.out
+++ b/src/test/regress/expected/misc_functions.out
@@ -201,6 +201,24 @@ select count(*) > 0 from
  t
 (1 row)
 
+select * from (select pg_ls_dir('.', false, true) as name) as ls where ls.name='.'; -- include_dot_dirs=true
+ name 
+--
+ .
+(1 row)
+
+select * from (select pg_ls_dir('.', false, false) as name) as ls where ls.name='.'; -- include_dot_dirs=false
+ name 
+--
+(0 rows)
+
+select pg_ls_dir('does not exist', true, false); -- ok with missingok=true
+ pg_ls_dir 
+---
+(0 rows)
+
+select pg_ls_dir('does not exist'); -- fails with missingok=false
+ERROR:  could not open directory "does not exist": No such file or directory
 --
 -- Test adding a support function to a subject function
 --
diff --git a/src/test/regress/sql/misc_functions.sql b/src/test/regress/sql/misc_functions.sql
index 094e8f8296..f6857ad177 100644
--- a/src/test/regress/sql/misc_functions.sql
+++ b/src/test/regress/sql/misc_functions.sql
@@ -60,6 +60,11 @@ select count(*) > 0 from
where spcname = 'pg_default') pts
   join pg_database db on pts.pts = db.oid;
 
+select * from (select pg_ls_dir('.', false, true) as name) as ls where ls.name='.'; -- 

Re: pg_ls_tmpdir to show directories and shared filesets (and pg_ls_*)

2020-03-17 Thread Fabien COELHO


About v13, seens as one patch:

Function "pg_ls_dir_metadata" documentation suggests a variable number of 
arguments with brackets, but parameters are really mandatory.


 postgres=# SELECT pg_ls_dir_metadata('.');
 ERROR:  function pg_ls_dir_metadata(unknown) does not exist
 LINE 1: SELECT pg_ls_dir_metadata('.');
   ^
 HINT:  No function matches the given name and argument types. You might need 
to add explicit type casts.
 postgres=# SELECT pg_ls_dir_metadata('.', true, true);
 …

The example in the documentation could be better indented. Also, ISTM that 
there are two implicit laterals (format & pg_ls_dir_recurse) that I would 
make explicit. I'd use the pcs alias explicitely. I'd use meaningful 
aliases (eg ts instead of b, …).


On reflection, I think that a boolean "isdir" column is a bad idea because 
it is not extensible. I'd propose to switch to the standard "ls" approach 
of providing the type as one character: '-' for regular, 'd' for 
directory, 'l' for link, 's' for socket, 'c' for character special…


ISTM that "lstat" is not available on windows, which suggests to call 
"stat" always, and then "lstat" on un*x and pg ports stuff on win.


I'm wondering about the restriction on directories only. Why should it not 
work on a file? Can it be easily extended to work on a simple file? If so, 
it could be just "pg_ls".


--
Fabien.

Re: pg_ls_tmpdir to show directories and shared filesets (and pg_ls_*)

2020-03-16 Thread Justin Pryzby
On Mon, Mar 16, 2020 at 07:17:36PM -0300, Alvaro Herrera wrote:
> I pushed 0001 and 0003 (as a single commit).  archive_statusdir didn't
> get here until 12, so your commit message was mistaken.  Also, pg10 is
> slightly different so it didn't apply there, so I left it alone.

Thanks, I appreciate it (and I'm sure Fabien will appreciate having two fewer
patches...).

@cfbot: rebased onto b4570d33aa045df330bb325ba8a2cbf02266a555

I realized that if I lstat() a file to make sure links to dirs show as
isdir=false, it's odd to then show size and timestamps of the dir.  So changed
to use lstat ... and squished.

-- 
Justin
>From d8294c4747c5ba1f3bec858c137cc2d31e5a0425 Mon Sep 17 00:00:00 2001
From: Justin Pryzby 
Date: Mon, 16 Mar 2020 14:12:55 -0500
Subject: [PATCH v13 1/8] Document historic behavior of links to directories..

Backpatch to 9.5: pg_stat_file
---
 doc/src/sgml/func.sgml | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index fc4d7f0f78..2c6142a0e0 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -21528,7 +21528,8 @@ SELECT convert_from(pg_read_binary_file('file_in_utf8.txt'), 'UTF8');
 size, last accessed time stamp, last modified time stamp,
 last file status change time stamp (Unix platforms only),
 file creation time stamp (Windows only), and a boolean
-indicating if it is a directory.  Typical usages include:
+indicating if it is a directory (or a symbolic link to a directory).
+Typical usages include:
 
 SELECT * FROM pg_stat_file('filename');
 SELECT (pg_stat_file('filename')).modification;
-- 
2.17.0

>From 27913ef889696ef2c42ed52bf7097e4a5aeaad9e Mon Sep 17 00:00:00 2001
From: Justin Pryzby 
Date: Mon, 9 Mar 2020 22:40:24 -0500
Subject: [PATCH v13 2/8] Add tests exercising pg_ls_tmpdir..

..and its backing function pg_ls_dir_files
---
 src/test/regress/expected/misc_functions.out | 7 +++
 src/test/regress/input/tablespace.source | 5 +
 src/test/regress/output/tablespace.source| 8 
 src/test/regress/sql/misc_functions.sql  | 4 
 4 files changed, 24 insertions(+)

diff --git a/src/test/regress/expected/misc_functions.out b/src/test/regress/expected/misc_functions.out
index d3acb98d04..903f1fe443 100644
--- a/src/test/regress/expected/misc_functions.out
+++ b/src/test/regress/expected/misc_functions.out
@@ -201,6 +201,13 @@ select count(*) > 0 from
  t
 (1 row)
 
+-- This tests the missing_ok parameter, which causes pg_ls_tmpdir to succeed even if the tmpdir doesn't exist yet
+-- The name='' condition is never true, so the function runs to completion but returns zero rows.
+select * from pg_ls_tmpdir() where name='Does not exist';
+ name | size | modification 
+--+--+--
+(0 rows)
+
 --
 -- Test adding a support function to a subject function
 --
diff --git a/src/test/regress/input/tablespace.source b/src/test/regress/input/tablespace.source
index a5f61a35dc..0b9cfe615e 100644
--- a/src/test/regress/input/tablespace.source
+++ b/src/test/regress/input/tablespace.source
@@ -11,6 +11,11 @@ DROP TABLESPACE regress_tblspacewith;
 -- create a tablespace we can use
 CREATE TABLESPACE regress_tblspace LOCATION '@testtablespace@';
 
+-- This tests the missing_ok parameter, which causes pg_ls_tmpdir to succeed even if the tmpdir doesn't exist yet
+-- The name='' condition is never true, so the function runs to completion but returns zero rows.
+-- The query is written to ERROR if the tablespace doesn't exist, rather than silently failing to call pg_ls_tmpdir()
+SELECT c.* FROM (SELECT oid FROM pg_tablespace b WHERE b.spcname='regress_tblspace' UNION SELECT 0 ORDER BY 1 DESC LIMIT 1) AS b , pg_ls_tmpdir(oid) AS c WHERE c.name='Does not exist';
+
 -- try setting and resetting some properties for the new tablespace
 ALTER TABLESPACE regress_tblspace SET (random_page_cost = 1.0, seq_page_cost = 1.1);
 ALTER TABLESPACE regress_tblspace SET (some_nonexistent_parameter = true);  -- fail
diff --git a/src/test/regress/output/tablespace.source b/src/test/regress/output/tablespace.source
index 162b591b31..a42714bf40 100644
--- a/src/test/regress/output/tablespace.source
+++ b/src/test/regress/output/tablespace.source
@@ -13,6 +13,14 @@ SELECT spcoptions FROM pg_tablespace WHERE spcname = 'regress_tblspacewith';
 DROP TABLESPACE regress_tblspacewith;
 -- create a tablespace we can use
 CREATE TABLESPACE regress_tblspace LOCATION '@testtablespace@';
+-- This tests the missing_ok parameter, which causes pg_ls_tmpdir to succeed even if the tmpdir doesn't exist yet
+-- The name='' condition is never true, so the function runs to completion but returns zero rows.
+-- The query is written to ERROR if the tablespace doesn't exist, rather than silently failing to call pg_ls_tmpdir()
+SELECT c.* FROM (SELECT oid FROM pg_tablespace b WHERE b.spcname='regress_tblspace' UNION SELECT 0 ORDER BY 1 DESC LIMIT 1) AS b , pg_ls_tmpdir(oid) AS c WHERE c.name='Does not 

Re: pg_ls_tmpdir to show directories and shared filesets (and pg_ls_*)

2020-03-16 Thread Alvaro Herrera
I pushed 0001 and 0003 (as a single commit).  archive_statusdir didn't
get here until 12, so your commit message was mistaken.  Also, pg10 is
slightly different so it didn't apply there, so I left it alone.

-- 
Álvaro Herrerahttps://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: pg_ls_tmpdir to show directories and shared filesets (and pg_ls_*)

2020-03-16 Thread Justin Pryzby
On Mon, Mar 16, 2020 at 04:20:21PM +0100, Fabien COELHO wrote:
> This probably means using lstat instead of (in supplement to?) stat, and
> probably tell if something is a link, and if so not recurse in them.

On Mon, Mar 16, 2020 at 07:21:06PM +0100, Fabien COELHO wrote:
> IMHO, I really think that it should be included. Dealing with links is no
> big deal, but you need an additional column in _metadata to tell it is a
> link

Instead of showing another column, I changed to show links with isdir=false.
At the cost of two more patches, to allow backpatching docs and maybe separate
commit to make the subtle change obvious in commit history, at least.

I see a few places in the backend and a few more in the fronted using the same
logic that I used for islink(), but I'm not sure if there's a good place to put
that to allow factoring out at least the other backend ones.

-- 
Justin
>From 1bb8e0efb4f14fa344cd5ee66c3138184a9fa9e2 Mon Sep 17 00:00:00 2001
From: Justin Pryzby 
Date: Fri, 6 Mar 2020 16:50:07 -0600
Subject: [PATCH v12 01/11] Document historic behavior about hiding directories
 and special files

Should backpatch to v10: tmpdir, waldir and archive_statusdir
---
 doc/src/sgml/func.sgml | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 323366feb6..4c0ea5ab3f 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -21450,6 +21450,7 @@ postgres=# SELECT * FROM pg_walfile_name_offset(pg_stop_backup());
 (mtime) of each file in the log directory. By default, only superusers
 and members of the pg_monitor role can use this function.
 Access may be granted to others using GRANT.
+Filenames beginning with a dot, directories, and other special files are not shown.

 

@@ -21461,6 +21462,7 @@ postgres=# SELECT * FROM pg_walfile_name_offset(pg_stop_backup());
 default only superusers and members of the pg_monitor role
 can use this function. Access may be granted to others using
 GRANT.
+Filenames beginning with a dot, directories, and other special files are not shown.

 

@@ -21473,6 +21475,7 @@ postgres=# SELECT * FROM pg_walfile_name_offset(pg_stop_backup());
 superusers and members of the pg_monitor role can
 use this function. Access may be granted to others using
 GRANT.
+Filenames beginning with a dot, directories, and other special files are not shown.

 

-- 
2.17.0

>From b6eb6cd9299e8ae07770790f05da038fede5278c Mon Sep 17 00:00:00 2001
From: Justin Pryzby 
Date: Mon, 16 Mar 2020 14:12:55 -0500
Subject: [PATCH v12 02/11] Document historic behavior of links to
 directories..

Backpatch to 9.5: pg_stat_file
---
 doc/src/sgml/func.sgml | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 4c0ea5ab3f..ace95fe661 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -21527,7 +21527,8 @@ SELECT convert_from(pg_read_binary_file('file_in_utf8.txt'), 'UTF8');
 size, last accessed time stamp, last modified time stamp,
 last file status change time stamp (Unix platforms only),
 file creation time stamp (Windows only), and a boolean
-indicating if it is a directory.  Typical usages include:
+indicating if it is a directory (or a symbolic link to a directory).
+Typical usages include:
 
 SELECT * FROM pg_stat_file('filename');
 SELECT (pg_stat_file('filename')).modification;
-- 
2.17.0

>From 52f5b76cc278c43983b044b7fb0f04c2d848d61e Mon Sep 17 00:00:00 2001
From: Justin Pryzby 
Date: Fri, 6 Mar 2020 17:12:04 -0600
Subject: [PATCH v12 03/11] Document historic behavior about hiding directories
 and special files

Should backpatch to v12: tmpdir
---
 doc/src/sgml/func.sgml | 1 +
 1 file changed, 1 insertion(+)

diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index ace95fe661..2c6142a0e0 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -21489,6 +21489,7 @@ postgres=# SELECT * FROM pg_walfile_name_offset(pg_stop_backup());
 default only superusers and members of the pg_monitor
 role can use this function.  Access may be granted to others using
 GRANT.
+Filenames beginning with a dot, directories, and other special files are not shown.

 

-- 
2.17.0

>From dd98e2b4f1fb37065a317719f0dd493e8837c7e7 Mon Sep 17 00:00:00 2001
From: Justin Pryzby 
Date: Mon, 9 Mar 2020 22:40:24 -0500
Subject: [PATCH v12 04/11] Add tests exercizing pg_ls_tmpdir..

..and its backing function pg_ls_dir_files
---
 src/test/regress/expected/misc_functions.out | 7 +++
 src/test/regress/input/tablespace.source | 5 +
 src/test/regress/output/tablespace.source| 8 
 src/test/regress/sql/misc_functions.sql  | 4 
 4 files changed, 24 insertions(+)

diff --git a/src/test/regress/expected/misc_functions.out b/src/test/regress/expected/misc_functions.out
index e217b678d7..9947d9ef9d 100644
--- 

Re: pg_ls_tmpdir to show directories and shared filesets (and pg_ls_*)

2020-03-16 Thread Fabien COELHO



Hello Justin,


 psql> SELECT * FROM pg_ls_dir_recurse('.');
 ERROR:  could not stat file 
"./base/foo/base/foo/base/foo/base/foo/base/foo/base/foo/base/foo/base/foo/base/foo/base/foo/base/foo/base/foo/base/foo/base/foo/base/foo/base/foo/base/foo/base/foo/base/foo/base/foo/base/foo/base/foo/base/foo/base/foo/base/foo/base/foo/base/foo/base/foo/base/foo/base/foo/base/foo/base/foo/base/foo/base/foo/base/foo/base/foo/base/foo/base/foo/base/foo/base/foo/base/foo":
 Too many levels of symbolic links
 CONTEXT:  SQL function "pg_ls_dir_recurse" statement 1

This probably means using lstat instead of (in supplement to?) stat, and
probably tell if something is a link, and if so not recurse in them.


Thanks for looking.

I think that opens up a can of worms.  I don't want to go into the business of
re-implementing all of find(1) - I count ~128 flags (most of which take
arguments).  You're referring to find -L vs find -P, and some people would want
one and some would want another.  And don't forget about find -H...


This is not the point. The point is that a link can change a finite tree 
into cyclic graph, and you do not want to delve into that, ever.


The "find" command, by default, does not recurse into a link because of 
said problem, and the user *must* ask for it and assume the infinite loop 
if any.


So if you implement one behavior, it should be not recursing into links. 
Franckly, I would not provide the recurse into link alternative, but it 
could be implemented if someone wants it, and the problem that come with 
it.



pg_stat_file doesn't expose the file type (I guess because it's not portable?),


You are right that Un*x and Windows are not the same wrt link. It seems 
that there is already something about that in port:


  "./src/port/dirmod.c:pgwin32_is_junction(const char *path)"

So most of the details are already hidden.


and I think it's outside the scope of this patch to change that.  Maybe it
suggests that the pg_ls_dir_recurse patch should be excluded.


IMHO, I really think that it should be included. Dealing with links is no 
big deal, but you need an additional column in _metadata to tell it is a 
link, and there is a ifdef because testing is a little different between 
unix and windows. I'd guess around 10-20 lines of code added.



ISTM if someone wants to recursively list a directory, they should avoid
putting cycles there, or permission errors, or similar.


Hmmm. I'd say the user should like to be able to call the function and 
never have a bad experience with it such as a failure on an infinite loop.


Or they should write their own C extension that borrows from 
pg_ls_dir_files but handles more arguments.


ISTM that the point of your patch is to provide the basic tool needed to 
list directories contents, and handling links somehow is a necessary part 
of that.


--
Fabien.




Re: pg_ls_tmpdir to show directories and shared filesets (and pg_ls_*)

2020-03-16 Thread Justin Pryzby
On Mon, Mar 16, 2020 at 04:20:21PM +0100, Fabien COELHO wrote:
> 
> About v11, ISTM that the recursive function should check for symbolic links
> and possibly avoid them:
> 
>  sh> cd data/base
>  sh> ln -s .. foo
> 
>  psql> SELECT * FROM pg_ls_dir_recurse('.');
>  ERROR:  could not stat file 
> "./base/foo/base/foo/base/foo/base/foo/base/foo/base/foo/base/foo/base/foo/base/foo/base/foo/base/foo/base/foo/base/foo/base/foo/base/foo/base/foo/base/foo/base/foo/base/foo/base/foo/base/foo/base/foo/base/foo/base/foo/base/foo/base/foo/base/foo/base/foo/base/foo/base/foo/base/foo/base/foo/base/foo/base/foo/base/foo/base/foo/base/foo/base/foo/base/foo/base/foo/base/foo":
>  Too many levels of symbolic links
>  CONTEXT:  SQL function "pg_ls_dir_recurse" statement 1
> 
> This probably means using lstat instead of (in supplement to?) stat, and
> probably tell if something is a link, and if so not recurse in them.

Thanks for looking.

I think that opens up a can of worms.  I don't want to go into the business of
re-implementing all of find(1) - I count ~128 flags (most of which take
arguments).  You're referring to find -L vs find -P, and some people would want
one and some would want another.  And don't forget about find -H...

pg_stat_file doesn't expose the file type (I guess because it's not portable?),
and I think it's outside the scope of this patch to change that.  Maybe it
suggests that the pg_ls_dir_recurse patch should be excluded.

ISTM if someone wants to recursively list a directory, they should avoid
putting cycles there, or permission errors, or similar.  Or they should write
their own C extension that borrows from pg_ls_dir_files but handles more
arguments.

-- 
Justin




Re: pg_ls_tmpdir to show directories and shared filesets (and pg_ls_*)

2020-03-16 Thread Fabien COELHO



About v11, ISTM that the recursive function should check for symbolic 
links and possibly avoid them:


 sh> cd data/base
 sh> ln -s .. foo

 psql> SELECT * FROM pg_ls_dir_recurse('.');
 ERROR:  could not stat file 
"./base/foo/base/foo/base/foo/base/foo/base/foo/base/foo/base/foo/base/foo/base/foo/base/foo/base/foo/base/foo/base/foo/base/foo/base/foo/base/foo/base/foo/base/foo/base/foo/base/foo/base/foo/base/foo/base/foo/base/foo/base/foo/base/foo/base/foo/base/foo/base/foo/base/foo/base/foo/base/foo/base/foo/base/foo/base/foo/base/foo/base/foo/base/foo/base/foo/base/foo/base/foo":
 Too many levels of symbolic links
 CONTEXT:  SQL function "pg_ls_dir_recurse" statement 1

This probably means using lstat instead of (in supplement to?) stat, and 
probably tell if something is a link, and if so not recurse in them.


--
Fabien.




Re: pg_ls_tmpdir to show directories and shared filesets (and pg_ls_*)

2020-03-15 Thread Justin Pryzby
On Sun, Mar 15, 2020 at 06:15:02PM +0100, Fabien COELHO wrote:
> Some feedback on v10:

Thanks for looking.  I'm hoping to hear from Alvaro what he thinks of this
approach (all functions to show isdir, rather than one function which lists
recursively).

> All patches apply cleanly, one on top of the previous. I really wish there
> would be less than 9 patches…

I kept them separate to allow the earlier patches to be applied.
And intended to make easier to review, even if it's more work for me..

If you mean that it's a pain to apply 9 patches, I will suggest to use:
|git am -3 ./mailbox
where ./mailbox is either a copy of the mail you received, or retrieved from
the web interface.

To test that each one works (compiles, passes tests, etc), I use git rebase -i
HEAD~11 and "e"edit the target (set of) patches.

> * v10.1 doc change: ok
> 
> * v10.2 doc change: ok, not sure why it is not merged with previous

As I mentioned, separate since I'm proposing that they're backpatched to
different releases.  Those could be applied now (and Tom already applied a
patch identical to 0001 in a prior patchset).

> * v10.3 test add: could be merge with both previous

> Tests seem a little contrived. I'm wondering whether something more
> straightforward could be proposed. For instance, once the tablespace is just
> created but not used yet, probably we do know that the tmp file exists and
> is empty?

The tmpdir *doesn't* exist until someone creates tmpfiles there.
As it mentions:
+-- This tests the missing_ok parameter, which causes pg_ls_tmpdir to succeed 
even if the tmpdir doesn't exist yet

> * v10.4 at least, some code!
> I'm not sure of the "FLAG_" prefix which seems too generic, even if it is
> local. I'd suggest "LS_DIR_*", maybe, as a more specific prefix.

Done.

> ISTM that Pg style requires spaces around operators. I'd suggest some
> parenthesis would help as well, eg: "flags_MISSING_OK" -> "(flags &
> FLAG_MISSING_OK)" and other instances.

Partially took your suggestion.

> About:
> 
>  if (S_ISDIR(attrib.st_mode)) {
>if (flags_SKIP_DIRS)
>  continue;
>   }
> 
> and similars, why not the simpler:
> 
>  if (S_ISDIR(attrib.st_mode) && (flags & FLAG_SKIP_DIRS))
> continue;

That's not the same - if SKIP_DIRS isn't set, your way would fail that test for
dirs, and then hit the !ISREG test, and skip them anyway.
|else if (!S_ISREG(attrib.st_mode))
|   continue

> Maybe I'd create defines for long and common flag specs, eg:
> 
>  #define ..._LS_SIMPLE 
> (FLAG_SKIP_DIRS|FLAG_SKIP_HIDDEN|FLAG_SKIP_SPECIAL|FLAG_METADATA)

Done

> I'm not sure about these asserts:
> 
>/* isdir depends on metadata */
>Assert(!(flags_ISDIR) || (flags_METADATA));
> 
> Hmmm. Why?

It's not supported to show isdir without showing metadata (because that case
isn't needed to support the old and the new behaviors).

+   if (flags & FLAG_METADATA)
+   {
+   values[1] = Int64GetDatum((int64) attrib.st_size);
+   values[2] = 
TimestampTzGetDatum(time_t_to_timestamptz(attrib.st_mtime));
+   if (flags & FLAG_ISDIR)
+   values[3] = 
BoolGetDatum(S_ISDIR(attrib.st_mode));
+   }

>/* Unreasonable to show isdir and skip dirs */
>Assert(!(flags_ISDIR) || !(flags_SKIP_DIRS));
> 
> Hmmm. Why would I prevent that, even if it has little sense, it should work.
> I do not see having false on the isdir column as an actual issue.

It's unimportant, but testing for intended use of flags during development.

> * v10.6 behavior change for existing functions, always show isdir column,
> and removal of IS_DIR flag.
> 
> I'm unsure why the features are removed, some use case may benefit from the
> more complete function?
> Maybe flags defs should not be changed anyway?

Maybe.  I put them back...but it means they're not being exercized by any
*used* case.

> I do not like much the "if (...) /* empty */;" code. Maybe it could be
> caught more cleanly later in the conditional structure.

This went away when I put back the SKIP_DIRS flag.

> * v10.7 adds "pg_ls_dir_recurse" function

> Doc looks incomplete and the example is very contrived and badly indented.

Why you think it's contrived?  Listing a tmpdir recursively is the initial
motivation of this patch.  Maybe you think I should list just the tmpdir for
one tablespace ?  Note that for temp_tablespaces parameter:

|When there is more than one name in the list, PostgreSQL chooses a random 
member of the list each time a temporary object is to be created; except that 
within a transaction, successively created temporary objects are placed in 
successive tablespaces from the list.

> The function definition does not follow the style around: uppercase whereas
> all others are lowercase, "" instead of '', no "as"…

I used "" because of this:
| x.name||'/'||a.name
I don't know if there's a better way to join paths in SQL, or if that suggests
this is 

Re: pg_ls_tmpdir to show directories and shared filesets (and pg_ls_*)

2020-03-15 Thread Fabien COELHO


Hello Justin,

Some feedback on v10:

All patches apply cleanly, one on top of the previous. I really wish there 
would be less than 9 patches…


* v10.1 doc change: ok

* v10.2 doc change: ok, not sure why it is not merged with previous

* v10.3 test add: could be merge with both previous

Tests seem a little contrived. I'm wondering whether something more 
straightforward could be proposed. For instance, once the tablespace is 
just created but not used yet, probably we do know that the tmp file 
exists and is empty?


* v10.4 at least, some code!

Compiles, make check ok.

pg_ls_dir_files: I'm fine with the flag approach given the number of 
switches and the internal nature of the function.


I'm not sure of the "FLAG_" prefix which seems too generic, even if it is 
local. I'd suggest "LS_DIR_*", maybe, as a more specific prefix.


ISTM that Pg style requires spaces around operators. I'd suggest some 
parenthesis would help as well, eg: "flags_MISSING_OK" -> "(flags & 
FLAG_MISSING_OK)" and other instances.


About:

 if (S_ISDIR(attrib.st_mode)) {
   if (flags_SKIP_DIRS)
 continue;
  }

and similars, why not the simpler:

 if (S_ISDIR(attrib.st_mode) && (flags & FLAG_SKIP_DIRS))
continue;

Especially that it is done like that in previous cases.

Maybe I'd create defines for long and common flag specs, eg:

 #define ..._LS_SIMPLE 
(FLAG_SKIP_DIRS|FLAG_SKIP_HIDDEN|FLAG_SKIP_SPECIAL|FLAG_METADATA)

No attempt at recursing.

I'm not sure about these asserts:

   /* isdir depends on metadata */
   Assert(!(flags_ISDIR) || (flags_METADATA));

Hmmm. Why?

   /* Unreasonable to show isdir and skip dirs */
   Assert(!(flags_ISDIR) || !(flags_SKIP_DIRS));

Hmmm. Why would I prevent that, even if it has little sense, it should 
work. I do not see having false on the isdir column as an actual issue.


* v10.5 add is_dir column, a few tests & doc.

Ok.

* v10.6 behavior change for existing functions, always show isdir column,
and removal of IS_DIR flag.

I'm unsure why the features are removed, some use case may benefit from 
the more complete function?


Maybe flags defs should not be changed anyway?

I do not like much the "if (...) /* empty */;" code. Maybe it could be 
caught more cleanly later in the conditional structure.


* v10.7 adds "pg_ls_dir_recurse" function

Using sql recurse to possibly to implement the feature is pretty elegant
and limits open directories to one at a time, which is pretty neat.

Doc looks incomplete and the example is very contrived and badly indented.

The function definition does not follow the style around: uppercase 
whereas all others are lowercase, "" instead of '', no "as"…


I do not understand why oid 8511 is given to the new function.

I do not understand why UNION ALL and not UNION.

I would have put the definition after "pg_ls_dir_metadata" definition.

pg_ls_dir_metadata seems defined as (text,bool,bool) but called as 
(text,bool,bool,bool).


Maybe a better alias could be given instead of x?

There are no tests for the new function. I'm not sure it would work.

* v10.8 change flags & add test on pg_ls_logdir().

I'm unsure why it is done at this stage.

* v10.9 change some ls functions and fix patch 10.7 issue

I'm unsure why it is done at this stage. "make check" ok.

--
Fabien.

Re: pg_ls_tmpdir to show directories and shared filesets (and pg_ls_*)

2020-03-13 Thread Justin Pryzby
@cfbot: rebased onto 085b6b6679e73b9b386f209b4d625c7bc60597c0

The merge conflict presents another opportunity to solicit comments on the new
approach.  Rather than making "recurse into tmpdir" the end goal:

  - add a function to show metadata of an arbitrary dir;
  - add isdir arguments to pg_ls_* functions (including pg_ls_tmpdir but not
pg_ls_dir).
  - maybe add pg_ls_dir_recurse, which satisfies the original need;
  - retire pg_ls_dir (does this work with tuplestore?)
  - profit

The alternative seems to be to go back to Alvaro's earlier proposal:
 - not only add "isdir", but also recurse;

I think I would insist on adding a general function to recurse into any dir.
And *optionally* change ps_ls_* to recurse (either by accepting an argument, or
by making that a separate patch to debate).  tuplestore is certainly better
than keeping a stack/List of DIRs for this.

On Tue, Mar 10, 2020 at 01:30:37PM -0500, Justin Pryzby wrote:
> I took a step back, and I wondered whether we should add a generic function 
> for
> listing a dir with metadata, possibly instead of changing the existing
> functions.  Then one could do pg_ls_dir_metadata('pg_wal',false,false);
> 
> Since pg8.1, we have pg_ls_dir() to show a list of files.  Since pg10, we've
> had pg_ls_logdir and pg_ls_waldir, which show not only file names but also
> (some) metadata (size, mtime).  And since pg12, we've had pg_ls_tmpfile and
> pg_ls_archive_statusdir, which also show metadata.
> 
> ...but there's no a function which lists the metadata of an directory other
> than tmp, wal, log.
> 
> One can do this:
> |SELECT b.*, c.* FROM (SELECT 'base' a)a, LATERAL (SELECT 
> a||'/'||pg_ls_dir(a.a)b)b, pg_stat_file(b)c;
> ..but that's not as helpful as allowing:
> |SELECT * FROM pg_ls_dir_metadata('.',true,true);
> 
> There's also no function which recurses into an arbitrary directory, so it
> seems shortsighted to provide a function to recursively list a tmpdir.
> 
> Also, since pg_ls_dir_metadata indicates whether the path is a dir, one can
> write a SQL function to show the dir recursively.  It'd be trivial to plug in
> wal/log/tmp (it seems like tmpdirs of other tablespace's are not entirely
> trivial).
> |SELECT * FROM pg_ls_dir_recurse('base/pgsql_tmp');
> 
> Also, on a neighboring thread[1], Tom indicated that the pg_ls_* functions
> should enumerate all files during the initial call, which sounds like a bad
> idea when recursively showing directories.  If we add a function recursing 
> into
> a directory, we'd need to discuss all the flags to expose to it, like recurse,
> ignore_errors, one_filesystem?, show_dotfiles (and eventually bikeshed all the
> rest of the flags in find(1)).
> 
> My initial patch [2] changed ls_tmpdir to show metadata columns including
> is_dir, but not decend.  It's pretty unfortunate if a function called
> pg_ls_tmpdir hides shared filesets, so maybe it really is best to change that
> (it's new in v12).
> 
> I'm interested to in feedback on the alternative approach, as attached.  The
> final patch to include all the rest of columns shown by pg_stat_file() is more
> of an idea/proposal and not sure if it'll be desirable.  But pg_ls_tmpdir() is
> essentially the same as my v1 patch.
> 
> This is intended to be mostly independent of any fix to the WARNING I reported
> [1].  Since my patch collapses pg_ls_dir into pg_ls_dir_files, we'd only need
> to fix one place.  I'm planning to eventually look into Tom's suggestion of
> returning tuplestore to fix that, and maybe rebase this patchset on top of
> that.
> 
> -- 
> Justin
> 
> [1] 
> https://www.postgresql.org/message-id/flat/20200308173103.GC1357%40telsasoft.com
> [2] 
> https://www.postgresql.org/message-id/20191214224735.GA28433%40telsasoft.com
>From 950eed8812a167a12da49553f7e3a2d1438d779b Mon Sep 17 00:00:00 2001
From: Justin Pryzby 
Date: Fri, 6 Mar 2020 16:50:07 -0600
Subject: [PATCH v10 1/9] Document historic behavior about hiding directories
 and special files

Should backpatch to v10: tmpdir, waldir and archive_statusdir
---
 doc/src/sgml/func.sgml | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 323366feb6..4c0ea5ab3f 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -21450,6 +21450,7 @@ postgres=# SELECT * FROM pg_walfile_name_offset(pg_stop_backup());
 (mtime) of each file in the log directory. By default, only superusers
 and members of the pg_monitor role can use this function.
 Access may be granted to others using GRANT.
+Filenames beginning with a dot, directories, and other special files are not shown.

 

@@ -21461,6 +21462,7 @@ postgres=# SELECT * FROM pg_walfile_name_offset(pg_stop_backup());
 default only superusers and members of the pg_monitor role
 can use this function. Access may be granted to others using
 GRANT.
+Filenames beginning with a dot, directories, and other special files are not shown.

 

@@ -21473,6 

Re: pg_ls_tmpdir to show directories and shared filesets (and pg_ls_*)

2020-03-10 Thread Justin Pryzby
I took a step back, and I wondered whether we should add a generic function for
listing a dir with metadata, possibly instead of changing the existing
functions.  Then one could do pg_ls_dir_metadata('pg_wal',false,false);

Since pg8.1, we have pg_ls_dir() to show a list of files.  Since pg10, we've
had pg_ls_logdir and pg_ls_waldir, which show not only file names but also
(some) metadata (size, mtime).  And since pg12, we've had pg_ls_tmpfile and
pg_ls_archive_statusdir, which also show metadata.

...but there's no a function which lists the metadata of an directory other
than tmp, wal, log.

One can do this:
|SELECT b.*, c.* FROM (SELECT 'base' a)a, LATERAL (SELECT 
a||'/'||pg_ls_dir(a.a)b)b, pg_stat_file(b)c;
..but that's not as helpful as allowing:
|SELECT * FROM pg_ls_dir_metadata('.',true,true);

There's also no function which recurses into an arbitrary directory, so it
seems shortsighted to provide a function to recursively list a tmpdir.

Also, since pg_ls_dir_metadata indicates whether the path is a dir, one can
write a SQL function to show the dir recursively.  It'd be trivial to plug in
wal/log/tmp (it seems like tmpdirs of other tablespace's are not entirely
trivial).
|SELECT * FROM pg_ls_dir_recurse('base/pgsql_tmp');

Also, on a neighboring thread[1], Tom indicated that the pg_ls_* functions
should enumerate all files during the initial call, which sounds like a bad
idea when recursively showing directories.  If we add a function recursing into
a directory, we'd need to discuss all the flags to expose to it, like recurse,
ignore_errors, one_filesystem?, show_dotfiles (and eventually bikeshed all the
rest of the flags in find(1)).

My initial patch [2] changed ls_tmpdir to show metadata columns including
is_dir, but not decend.  It's pretty unfortunate if a function called
pg_ls_tmpdir hides shared filesets, so maybe it really is best to change that
(it's new in v12).

I'm interested to in feedback on the alternative approach, as attached.  The
final patch to include all the rest of columns shown by pg_stat_file() is more
of an idea/proposal and not sure if it'll be desirable.  But pg_ls_tmpdir() is
essentially the same as my v1 patch.

This is intended to be mostly independent of any fix to the WARNING I reported
[1].  Since my patch collapses pg_ls_dir into pg_ls_dir_files, we'd only need
to fix one place.  I'm planning to eventually look into Tom's suggestion of
returning tuplestore to fix that, and maybe rebase this patchset on top of
that.

-- 
Justin

[1] 
https://www.postgresql.org/message-id/flat/20200308173103.GC1357%40telsasoft.com
[2] https://www.postgresql.org/message-id/20191214224735.GA28433%40telsasoft.com
>From 2c4b2c408490ecde3cfb4e336a78942f7a6f8197 Mon Sep 17 00:00:00 2001
From: Justin Pryzby 
Date: Fri, 27 Dec 2019 23:34:14 -0600
Subject: [PATCH v9 01/11] BUG: in errmsg

Note there's two changes here.
Should backpatch to v12, where pg_ls_tmpdir was added.
---
 src/backend/utils/adt/genfile.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/backend/utils/adt/genfile.c b/src/backend/utils/adt/genfile.c
index 3741b87486..897b11a77d 100644
--- a/src/backend/utils/adt/genfile.c
+++ b/src/backend/utils/adt/genfile.c
@@ -590,7 +590,7 @@ pg_ls_dir_files(FunctionCallInfo fcinfo, const char *dir, bool missing_ok)
 		if (stat(path, ) < 0)
 			ereport(ERROR,
 	(errcode_for_file_access(),
-	 errmsg("could not stat directory \"%s\": %m", dir)));
+	 errmsg("could not stat file \"%s\": %m", path)));
 
 		/* Ignore anything but regular files */
 		if (!S_ISREG(attrib.st_mode))
-- 
2.17.0

>From f3ef0c6ff664f2f26e95ce97e8b50a813bd1aab8 Mon Sep 17 00:00:00 2001
From: Justin Pryzby 
Date: Fri, 6 Mar 2020 16:50:07 -0600
Subject: [PATCH v9 02/11] Document historic behavior about hiding directories
 and special files

Should backpatch to v10: tmpdir, waldir and archive_statusdir
---
 doc/src/sgml/func.sgml | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 323366feb6..4c0ea5ab3f 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -21450,6 +21450,7 @@ postgres=# SELECT * FROM pg_walfile_name_offset(pg_stop_backup());
 (mtime) of each file in the log directory. By default, only superusers
 and members of the pg_monitor role can use this function.
 Access may be granted to others using GRANT.
+Filenames beginning with a dot, directories, and other special files are not shown.

 

@@ -21461,6 +21462,7 @@ postgres=# SELECT * FROM pg_walfile_name_offset(pg_stop_backup());
 default only superusers and members of the pg_monitor role
 can use this function. Access may be granted to others using
 GRANT.
+Filenames beginning with a dot, directories, and other special files are not shown.

 

@@ -21473,6 +21475,7 @@ postgres=# SELECT * FROM pg_walfile_name_offset(pg_stop_backup());
 superusers and members of the pg_monitor role can
 use this 

Re: pg_ls_tmpdir to show directories and shared filesets (and pg_ls_*)

2020-03-08 Thread Fabien COELHO



Hello Justin,

Patch series applies cleanly. The last status compiles and passes "make 
check". A few more comments:


* v8.[123] ok.

* v8.4

Avoid using the type name as a field name? "enum dir_action dir_action;" 
-> "enum dir_action action", or maybe rename "dir_action" enum 
"dir_action_t".


About pg_ls_dir:

"if (!fctx->dirdesc)" I do not think that is true even if AllocateDir 
failed, the list exists anyway. ISTM it should be linitial which is NULL 
in that case.


Given the overlap between pg_ls_dir and pg_ls_dir_files, ISTM that the 
former should call the slightly extended later with appropriate flags.


About populate_paths:

function is a little bit strange to me, ISTM it would deserve more 
comments.


I'm not sure the name reflect what it does. For instance, ISTM that it 
does one thing, but the name is plural. Maybe "move_to_next_path" or 
"update_current_path" or something?


It returns an int which can only be 0 or 1, which smells like an bool. 
What is this int/bool is not told in the function head comment. I guess it 
is whether the path was updated. When it returns false, the list length is 
down to one.


Shouldn't AllocateDir be tested for bad result? Maybe it is a dir but you 
do not have perms to open it? Or give a comment about why it cannot 
happen?


later, good, at least the function is called, even if it is only for an 
error case. Maybe some non empty coverage tests could be added with a 
"count(*) > 0" on not is_dir or maybe "count(*) = 0" on is_dir, for 
instance?


  (SELECT oid FROM pg_tablespace b WHERE b.spcname='regress_tblspace'
  UNION SELECT 0 ORDER BY 1 DESC LIMIT 1)b

The 'b' glued to the ')' looks pretty strange. I'd suggest ") AS b". 
Reusing the same alias twice could be avoided for clarity, maybe.


* v8.[56]

I'd say that a behavior change which adds a column and a possibly a few 
rows is ok, especially as the tmpdir contains subdirs now.


--
Fabien.




Re: pg_ls_tmpdir to show directories and shared filesets (and pg_ls_*)

2020-03-07 Thread Justin Pryzby
On Sat, Mar 07, 2020 at 03:14:37PM +0100, Fabien COELHO wrote:
> The documentation sentences could probably be improved "for for", "used ...
> used". Maybe:

> ISTM that several instances of: "pg_ls_dir_files(..., true, false);" should
> be "pg_ls_dir_files(..., true, DIR_HIDE);".

> Alas, ISTM that there are no tests on any of these functions:-(

Addressed these.

And reordered the last two commits to demonstrate and exercize the behavior
change in regress test.

-- 
Justin
>From a5b9a03445d1c768662cafebd8ab3bd7a62890aa Mon Sep 17 00:00:00 2001
From: Justin Pryzby 
Date: Fri, 27 Dec 2019 23:34:14 -0600
Subject: [PATCH v8 1/6] BUG: in errmsg

Note there's two changes here.
Should backpatch to v12, where pg_ls_tmpdir was added.
---
 src/backend/utils/adt/genfile.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/backend/utils/adt/genfile.c b/src/backend/utils/adt/genfile.c
index 3741b87486..897b11a77d 100644
--- a/src/backend/utils/adt/genfile.c
+++ b/src/backend/utils/adt/genfile.c
@@ -590,7 +590,7 @@ pg_ls_dir_files(FunctionCallInfo fcinfo, const char *dir, bool missing_ok)
 		if (stat(path, ) < 0)
 			ereport(ERROR,
 	(errcode_for_file_access(),
-	 errmsg("could not stat directory \"%s\": %m", dir)));
+	 errmsg("could not stat file \"%s\": %m", path)));
 
 		/* Ignore anything but regular files */
 		if (!S_ISREG(attrib.st_mode))
-- 
2.17.0

>From 6ea85ec0a267930320b8454a33bca368a8544a2d Mon Sep 17 00:00:00 2001
From: Justin Pryzby 
Date: Fri, 6 Mar 2020 16:50:07 -0600
Subject: [PATCH v8 2/6] Document historic behavior about hiding directories
 and special files

Should backpatch to v10: tmpdir, waldir and archive_statusdir
---
 doc/src/sgml/func.sgml | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 323366feb6..4c0ea5ab3f 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -21450,6 +21450,7 @@ postgres=# SELECT * FROM pg_walfile_name_offset(pg_stop_backup());
 (mtime) of each file in the log directory. By default, only superusers
 and members of the pg_monitor role can use this function.
 Access may be granted to others using GRANT.
+Filenames beginning with a dot, directories, and other special files are not shown.

 

@@ -21461,6 +21462,7 @@ postgres=# SELECT * FROM pg_walfile_name_offset(pg_stop_backup());
 default only superusers and members of the pg_monitor role
 can use this function. Access may be granted to others using
 GRANT.
+Filenames beginning with a dot, directories, and other special files are not shown.

 

@@ -21473,6 +21475,7 @@ postgres=# SELECT * FROM pg_walfile_name_offset(pg_stop_backup());
 superusers and members of the pg_monitor role can
 use this function. Access may be granted to others using
 GRANT.
+Filenames beginning with a dot, directories, and other special files are not shown.

 

-- 
2.17.0

>From 5250d637493627f1ff3587bc73dd598bc1ca3ffc Mon Sep 17 00:00:00 2001
From: Justin Pryzby 
Date: Fri, 6 Mar 2020 17:12:04 -0600
Subject: [PATCH v8 3/6] Document historic behavior about hiding directories
 and special files

Should backpatch to v12: tmpdir
---
 doc/src/sgml/func.sgml | 1 +
 1 file changed, 1 insertion(+)

diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 4c0ea5ab3f..fc4d7f0f78 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -21489,6 +21489,7 @@ postgres=# SELECT * FROM pg_walfile_name_offset(pg_stop_backup());
 default only superusers and members of the pg_monitor
 role can use this function.  Access may be granted to others using
 GRANT.
+Filenames beginning with a dot, directories, and other special files are not shown.

 

-- 
2.17.0

>From 70183f1ba1eb33e5c279a6d22a56fcaebdbfbb97 Mon Sep 17 00:00:00 2001
From: Justin Pryzby 
Date: Sat, 14 Dec 2019 16:22:15 -0600
Subject: [PATCH v8 4/6] pg_ls_tmpdir to show directories recursively

See also 9cd92d1a33699f86aa53d44ab04cc3eb50c18d11

Discussion
https://www.postgresql.org/message-id/flat/20191213053931.gv2...@telsasoft.com
https://www.postgresql.org/message-id/flat/20191227170220.ge12...@telsasoft.com

Need catversion bump
---
 doc/src/sgml/func.sgml|  14 +-
 src/backend/utils/adt/genfile.c   | 167 --
 src/include/catalog/pg_proc.dat   |   8 +-
 src/test/regress/input/tablespace.source  |   5 +
 src/test/regress/output/tablespace.source |   8 ++
 5 files changed, 147 insertions(+), 55 deletions(-)

diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index fc4d7f0f78..234c9c1699 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -21382,8 +21382,9 @@ postgres=# SELECT * FROM pg_walfile_name_offset(pg_stop_backup());

setof record

-List the name, size, and last modification time of files in the
-temporary directory for tablespace.  If
+For the