Re: [HACKERS] pg_basebackup and pg_stat_tmp directory

2014-02-03 Thread Fujii Masao
On Fri, Jan 31, 2014 at 9:29 PM, Fujii Masao masao.fu...@gmail.com wrote:
 On Tue, Jan 28, 2014 at 5:51 PM, Magnus Hagander mag...@hagander.net wrote:
 On Tue, Jan 28, 2014 at 6:11 AM, Amit Kapila amit.kapil...@gmail.com
 wrote:

 On Tue, Jan 28, 2014 at 9:26 AM, Fujii Masao masao.fu...@gmail.com
 wrote:
  Hi,
 
  The files in pg_stat_tmp directory don't need to be backed up because
  they are
  basically reset at the archive recovery. So I think it's worth
  changing pg_basebackup
  so that it skips any files in pg_stat_tmp directory. Thought?

 I think this is good idea, but can't it also avoid
 PGSTAT_STAT_PERMANENT_TMPFILE along with temp files in
 pg_stat_tmp


 All stats files should be excluded. IIRC the PGSTAT_STAT_PERMANENT_TMPFILE
 refers to just the global one. You want to exclude based on
 PGSTAT_STAT_PERMANENT_DIRECTORY (and of course based on the guc
 stats_temp_directory if it's in PGDATA.

 Attached patch changes basebackup.c so that it skips all files in both
 pg_stat_tmp
 and stats_temp_directory. Even when a user sets stats_temp_directory
 to the directory
 other than pg_stat_tmp, we need to skip the files in pg_stat_tmp. Because,
 per recent change of pg_stat_statements, the external query file is
 always created there.

Committed.

Regards,

-- 
Fujii Masao


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_basebackup and pg_stat_tmp directory

2014-02-02 Thread Fujii Masao
On Sat, Feb 1, 2014 at 2:08 AM, Robert Haas robertmh...@gmail.com wrote:
 On Fri, Jan 31, 2014 at 8:40 AM, Fujii Masao masao.fu...@gmail.com wrote:
 Yeah, I was thinking that, too. I'm not sure whether including log files
 in backup really increases the security risk, though. There are already
 very important data, i.e., database, in backups. Anyway, since
 the amount of log files can be very large and they are not essential
 for recovery, it's worth considering whether to exclude them. OTOH,
 I'm sure that some users prefer current behavior for some reasons.
 So I think that it's better to expose the pg_basebackup option
 specifying whether log files are included in backups or not.

 I don't really see how this can work reliably; pg_log isn't a
 hard-coded name, but rather a GUC that users can leave set to that
 value or set to any other value they choose.

I'm thinking to change basebackup.c so that it compares the
name of the directory that it's trying to back up and the setting
value of log_directory parameter, then, if they are the same,
it just skips the directory. The patch that I sent upthread does
this regarding stats_temp_directory.

Regards,

-- 
Fujii Masao


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_basebackup and pg_stat_tmp directory

2014-02-02 Thread Peter Eisentraut
On 2/2/14, 10:23 AM, Fujii Masao wrote:
 I'm thinking to change basebackup.c so that it compares the
 name of the directory that it's trying to back up and the setting
 value of log_directory parameter, then, if they are the same,
 it just skips the directory. The patch that I sent upthread does
 this regarding stats_temp_directory.

I'm undecided on whether log files should be copied, but in case we
decide not to, it needs to be considered whether we at least recreate
the pg_log directory on the standby.  Otherwise weird things will happen
when you start the standby, and it would introduce an extra fixup step
to sort that out.

Extra credit for doing something useful when pg_log is a symlink.

I fear, however, that if you end up implementing all that logic, it
would become too much special magic.



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_basebackup and pg_stat_tmp directory

2014-02-02 Thread Fujii Masao
On Mon, Feb 3, 2014 at 6:57 AM, Peter Eisentraut pete...@gmx.net wrote:
 On 2/2/14, 10:23 AM, Fujii Masao wrote:
 I'm thinking to change basebackup.c so that it compares the
 name of the directory that it's trying to back up and the setting
 value of log_directory parameter, then, if they are the same,
 it just skips the directory. The patch that I sent upthread does
 this regarding stats_temp_directory.

 I'm undecided on whether log files should be copied, but in case we
 decide not to, it needs to be considered whether we at least recreate
 the pg_log directory on the standby.  Otherwise weird things will happen
 when you start the standby, and it would introduce an extra fixup step
 to sort that out.

Yes, basically we should skip only files under pg_log, but not pg_log
directory itself.

 Extra credit for doing something useful when pg_log is a symlink.

 I fear, however, that if you end up implementing all that logic, it
 would become too much special magic.

ISTM that pg_xlog has already been handled in that way by basebackup.

Regards,

-- 
Fujii Masao


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_basebackup and pg_stat_tmp directory

2014-01-31 Thread Fujii Masao
On Tue, Jan 28, 2014 at 5:51 PM, Magnus Hagander mag...@hagander.net wrote:
 On Tue, Jan 28, 2014 at 6:11 AM, Amit Kapila amit.kapil...@gmail.com
 wrote:

 On Tue, Jan 28, 2014 at 9:26 AM, Fujii Masao masao.fu...@gmail.com
 wrote:
  Hi,
 
  The files in pg_stat_tmp directory don't need to be backed up because
  they are
  basically reset at the archive recovery. So I think it's worth
  changing pg_basebackup
  so that it skips any files in pg_stat_tmp directory. Thought?

 I think this is good idea, but can't it also avoid
 PGSTAT_STAT_PERMANENT_TMPFILE along with temp files in
 pg_stat_tmp


 All stats files should be excluded. IIRC the PGSTAT_STAT_PERMANENT_TMPFILE
 refers to just the global one. You want to exclude based on
 PGSTAT_STAT_PERMANENT_DIRECTORY (and of course based on the guc
 stats_temp_directory if it's in PGDATA.

Attached patch changes basebackup.c so that it skips all files in both
pg_stat_tmp
and stats_temp_directory. Even when a user sets stats_temp_directory
to the directory
other than pg_stat_tmp, we need to skip the files in pg_stat_tmp. Because,
per recent change of pg_stat_statements, the external query file is
always created there.

Regards,

-- 
Fujii Masao
*** a/contrib/pg_stat_statements/pg_stat_statements.c
--- b/contrib/pg_stat_statements/pg_stat_statements.c
***
*** 68,73 
--- 68,74 
  #include parser/analyze.h
  #include parser/parsetree.h
  #include parser/scanner.h
+ #include pgstat.h
  #include storage/fd.h
  #include storage/ipc.h
  #include storage/spin.h
***
*** 89,95  PG_MODULE_MAGIC;
   * race conditions.  Besides, we only expect modest, infrequent I/O for query
   * strings, so placing the file on a faster filesystem is not compelling.
   */
! #define PGSS_TEXT_FILE	pg_stat_tmp/pgss_query_texts.stat
  
  /* Magic number identifying the stats file format */
  static const uint32 PGSS_FILE_HEADER = 0x20140125;
--- 90,96 
   * race conditions.  Besides, we only expect modest, infrequent I/O for query
   * strings, so placing the file on a faster filesystem is not compelling.
   */
! #define PGSS_TEXT_FILE	PG_STAT_TMP_DIR /pgss_query_texts.stat
  
  /* Magic number identifying the stats file format */
  static const uint32 PGSS_FILE_HEADER = 0x20140125;
*** a/src/backend/replication/basebackup.c
--- b/src/backend/replication/basebackup.c
***
*** 25,30 
--- 25,31 
  #include libpq/pqformat.h
  #include miscadmin.h
  #include nodes/pg_list.h
+ #include pgstat.h
  #include replication/basebackup.h
  #include replication/walsender.h
  #include replication/walsender_private.h
***
*** 63,68  static int	compareWalFileNames(const void *a, const void *b);
--- 64,72 
  /* Was the backup currently in-progress initiated in recovery mode? */
  static bool backup_started_in_recovery = false;
  
+ /* Relative path of temporary statistics directory */
+ static char *statrelpath = NULL;
+ 
  /*
   * Size of each block sent into the tar stream for larger files.
   */
***
*** 111,116  perform_base_backup(basebackup_options *opt, DIR *tblspcdir)
--- 115,132 
    labelfile);
  	SendXlogRecPtrResult(startptr, starttli);
  
+ 	/*
+ 	 * Calculate the relative path of temporary statistics directory
+ 	 * in order to skip the files which are located in that directory later.
+ 	 */
+ 	if (is_absolute_path(pgstat_stat_directory) 
+ 		strncmp(pgstat_stat_directory, DataDir, datadirpathlen) == 0)
+ 		statrelpath = psprintf(./%s, pgstat_stat_directory + datadirpathlen + 1);
+ 	else if (strncmp(pgstat_stat_directory, ./, 2) != 0)
+ 		statrelpath = psprintf(./%s, pgstat_stat_directory);
+ 	else
+ 		statrelpath = pgstat_stat_directory;
+ 
  	PG_ENSURE_ERROR_CLEANUP(base_backup_cleanup, (Datum) 0);
  	{
  		List	   *tablespaces = NIL;
***
*** 838,844  sendDir(char *path, int basepathlen, bool sizeonly, List *tablespaces)
  	sizeof(PG_AUTOCONF_FILENAME) + 4) == 0)
  			continue;
  
- 
  		/*
  		 * If there's a backup_label file, it belongs to a backup started by
  		 * the user with pg_start_backup(). It is *not* correct for this
--- 854,859 
***
*** 888,893  sendDir(char *path, int basepathlen, bool sizeonly, List *tablespaces)
--- 903,922 
  		}
  
  		/*
+ 		 * Skip temporary statistics files. PG_STAT_TMP_DIR must be skipped
+ 		 * even when stats_temp_directory is set because PGSS_TEXT_FILE is
+ 		 * always created there.
+ 		 */
+ 		if ((statrelpath != NULL  strcmp(pathbuf, statrelpath) == 0) ||
+ 			strncmp(de-d_name, PG_STAT_TMP_DIR, strlen(PG_STAT_TMP_DIR)) == 0)
+ 		{
+ 			if (!sizeonly)
+ _tarWriteHeader(pathbuf + basepathlen + 1, NULL, statbuf);
+ 			size += 512;
+ 			continue;
+ 		}
+ 
+ 		/*
  		 * We can skip pg_xlog, the WAL segments need to be fetched from the
  		 * WAL archive anyway. But include it as an empty directory anyway, so
  		 * we get permissions right.
*** a/src/backend/utils/misc/guc.c
--- 

Re: [HACKERS] pg_basebackup and pg_stat_tmp directory

2014-01-31 Thread Mitsumasa KONDO
2014-01-31 Fujii Masao masao.fu...@gmail.com

 On Tue, Jan 28, 2014 at 5:51 PM, Magnus Hagander mag...@hagander.net
 wrote:
  On Tue, Jan 28, 2014 at 6:11 AM, Amit Kapila amit.kapil...@gmail.com
  wrote:
 
  On Tue, Jan 28, 2014 at 9:26 AM, Fujii Masao masao.fu...@gmail.com
  wrote:
   Hi,
  
   The files in pg_stat_tmp directory don't need to be backed up because
   they are
   basically reset at the archive recovery. So I think it's worth
   changing pg_basebackup
   so that it skips any files in pg_stat_tmp directory. Thought?
 
  I think this is good idea, but can't it also avoid
  PGSTAT_STAT_PERMANENT_TMPFILE along with temp files in
  pg_stat_tmp
 
 
  All stats files should be excluded. IIRC the
 PGSTAT_STAT_PERMANENT_TMPFILE
  refers to just the global one. You want to exclude based on
  PGSTAT_STAT_PERMANENT_DIRECTORY (and of course based on the guc
  stats_temp_directory if it's in PGDATA.

 Attached patch changes basebackup.c so that it skips all files in both
 pg_stat_tmp
 and stats_temp_directory. Even when a user sets stats_temp_directory
 to the directory
 other than pg_stat_tmp, we need to skip the files in pg_stat_tmp. Because,
 per recent change of pg_stat_statements, the external query file is
 always created there.

+1.

And, I'd like to also skip pg_log directory because security reason.
If you have time and get community agreed,
could you create these patch after committed your patch?
I don't want to bother you.

Regards,
--
Mitsumasa KONDO
NTT Open Source Software Center


Re: [HACKERS] pg_basebackup and pg_stat_tmp directory

2014-01-31 Thread Fujii Masao
On Fri, Jan 31, 2014 at 10:18 PM, Mitsumasa KONDO
kondo.mitsum...@gmail.com wrote:
 2014-01-31 Fujii Masao masao.fu...@gmail.com

 On Tue, Jan 28, 2014 at 5:51 PM, Magnus Hagander mag...@hagander.net
 wrote:
  On Tue, Jan 28, 2014 at 6:11 AM, Amit Kapila amit.kapil...@gmail.com
  wrote:
 
  On Tue, Jan 28, 2014 at 9:26 AM, Fujii Masao masao.fu...@gmail.com
  wrote:
   Hi,
  
   The files in pg_stat_tmp directory don't need to be backed up because
   they are
   basically reset at the archive recovery. So I think it's worth
   changing pg_basebackup
   so that it skips any files in pg_stat_tmp directory. Thought?
 
  I think this is good idea, but can't it also avoid
  PGSTAT_STAT_PERMANENT_TMPFILE along with temp files in
  pg_stat_tmp
 
 
  All stats files should be excluded. IIRC the
  PGSTAT_STAT_PERMANENT_TMPFILE
  refers to just the global one. You want to exclude based on
  PGSTAT_STAT_PERMANENT_DIRECTORY (and of course based on the guc
  stats_temp_directory if it's in PGDATA.

 Attached patch changes basebackup.c so that it skips all files in both
 pg_stat_tmp
 and stats_temp_directory. Even when a user sets stats_temp_directory
 to the directory
 other than pg_stat_tmp, we need to skip the files in pg_stat_tmp. Because,
 per recent change of pg_stat_statements, the external query file is
 always created there.

 +1.

 And, I'd like to also skip pg_log directory because security reason.

Yeah, I was thinking that, too. I'm not sure whether including log files
in backup really increases the security risk, though. There are already
very important data, i.e., database, in backups. Anyway, since
the amount of log files can be very large and they are not essential
for recovery, it's worth considering whether to exclude them. OTOH,
I'm sure that some users prefer current behavior for some reasons.
So I think that it's better to expose the pg_basebackup option
specifying whether log files are included in backups or not.

Regards,

-- 
Fujii Masao


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_basebackup and pg_stat_tmp directory

2014-01-31 Thread Mitsumasa KONDO
2014-01-31 Fujii Masao masao.fu...@gmail.com:

 On Fri, Jan 31, 2014 at 10:18 PM, Mitsumasa KONDO
 kondo.mitsum...@gmail.com wrote:
  2014-01-31 Fujii Masao masao.fu...@gmail.com
 
  On Tue, Jan 28, 2014 at 5:51 PM, Magnus Hagander mag...@hagander.net
  wrote:
   On Tue, Jan 28, 2014 at 6:11 AM, Amit Kapila amit.kapil...@gmail.com
 
   wrote:
  
   On Tue, Jan 28, 2014 at 9:26 AM, Fujii Masao masao.fu...@gmail.com
   wrote:
Hi,
   
The files in pg_stat_tmp directory don't need to be backed up
 because
they are
basically reset at the archive recovery. So I think it's worth
changing pg_basebackup
so that it skips any files in pg_stat_tmp directory. Thought?
  
   I think this is good idea, but can't it also avoid
   PGSTAT_STAT_PERMANENT_TMPFILE along with temp files in
   pg_stat_tmp
  
  
   All stats files should be excluded. IIRC the
   PGSTAT_STAT_PERMANENT_TMPFILE
   refers to just the global one. You want to exclude based on
   PGSTAT_STAT_PERMANENT_DIRECTORY (and of course based on the guc
   stats_temp_directory if it's in PGDATA.
 
  Attached patch changes basebackup.c so that it skips all files in both
  pg_stat_tmp
  and stats_temp_directory. Even when a user sets stats_temp_directory
  to the directory
  other than pg_stat_tmp, we need to skip the files in pg_stat_tmp.
 Because,
  per recent change of pg_stat_statements, the external query file is
  always created there.
 
  +1.
 
  And, I'd like to also skip pg_log directory because security reason.

 Yeah, I was thinking that, too. I'm not sure whether including log files
 in backup really increases the security risk, though. There are already
 very important data, i.e., database, in backups. Anyway, since
 the amount of log files can be very large and they are not essential
 for recovery, it's worth considering whether to exclude them. OTOH,
 I'm sure that some users prefer current behavior for some reasons.
 So I think that it's better to expose the pg_basebackup option
 specifying whether log files are included in backups or not.

I'm with you. Thanks a lot !

Regards,
--
Mitsumsasa KONDO
NTT Open Source Software Center


Re: [HACKERS] pg_basebackup and pg_stat_tmp directory

2014-01-31 Thread Robert Haas
On Fri, Jan 31, 2014 at 8:40 AM, Fujii Masao masao.fu...@gmail.com wrote:
 Yeah, I was thinking that, too. I'm not sure whether including log files
 in backup really increases the security risk, though. There are already
 very important data, i.e., database, in backups. Anyway, since
 the amount of log files can be very large and they are not essential
 for recovery, it's worth considering whether to exclude them. OTOH,
 I'm sure that some users prefer current behavior for some reasons.
 So I think that it's better to expose the pg_basebackup option
 specifying whether log files are included in backups or not.

I don't really see how this can work reliably; pg_log isn't a
hard-coded name, but rather a GUC that users can leave set to that
value or set to any other value they choose.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_basebackup and pg_stat_tmp directory

2014-01-28 Thread Magnus Hagander
On Tue, Jan 28, 2014 at 6:11 AM, Amit Kapila amit.kapil...@gmail.comwrote:

 On Tue, Jan 28, 2014 at 9:26 AM, Fujii Masao masao.fu...@gmail.com
 wrote:
  Hi,
 
  The files in pg_stat_tmp directory don't need to be backed up because
 they are
  basically reset at the archive recovery. So I think it's worth
  changing pg_basebackup
  so that it skips any files in pg_stat_tmp directory. Thought?

 I think this is good idea, but can't it also avoid
 PGSTAT_STAT_PERMANENT_TMPFILE along with temp files in
 pg_stat_tmp


All stats files should be excluded. IIRC the PGSTAT_STAT_PERMANENT_TMPFILE
refers to just the global one. You want to exclude based
on PGSTAT_STAT_PERMANENT_DIRECTORY (and of course based on the
guc stats_temp_directory if it's in PGDATA.

-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/


[HACKERS] pg_basebackup and pg_stat_tmp directory

2014-01-27 Thread Fujii Masao
Hi,

The files in pg_stat_tmp directory don't need to be backed up because they are
basically reset at the archive recovery. So I think it's worth
changing pg_basebackup
so that it skips any files in pg_stat_tmp directory. Thought?

Regards,

-- 
Fujii Masao


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_basebackup and pg_stat_tmp directory

2014-01-27 Thread Michael Paquier
On Tue, Jan 28, 2014 at 12:56 PM, Fujii Masao masao.fu...@gmail.com wrote:
 Hi,

 The files in pg_stat_tmp directory don't need to be backed up because they are
 basically reset at the archive recovery. So I think it's worth
 changing pg_basebackup
 so that it skips any files in pg_stat_tmp directory. Thought?
Skipping pgstat_temp_directory in basebackup.c would make more sense
than directly touching pg_basebackup.
My 2c.
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_basebackup and pg_stat_tmp directory

2014-01-27 Thread Fujii Masao
On Tue, Jan 28, 2014 at 1:08 PM, Michael Paquier
michael.paqu...@gmail.com wrote:
 On Tue, Jan 28, 2014 at 12:56 PM, Fujii Masao masao.fu...@gmail.com wrote:
 Hi,

 The files in pg_stat_tmp directory don't need to be backed up because they 
 are
 basically reset at the archive recovery. So I think it's worth
 changing pg_basebackup
 so that it skips any files in pg_stat_tmp directory. Thought?
 Skipping pgstat_temp_directory in basebackup.c would make more sense
 than directly touching pg_basebackup.
 My 2c.

Yeah, that's what I was thinking :)

Regards,

-- 
Fujii Masao


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_basebackup and pg_stat_tmp directory

2014-01-27 Thread Amit Kapila
On Tue, Jan 28, 2014 at 9:26 AM, Fujii Masao masao.fu...@gmail.com wrote:
 Hi,

 The files in pg_stat_tmp directory don't need to be backed up because they are
 basically reset at the archive recovery. So I think it's worth
 changing pg_basebackup
 so that it skips any files in pg_stat_tmp directory. Thought?

I think this is good idea, but can't it also avoid
PGSTAT_STAT_PERMANENT_TMPFILE along with temp files in
pg_stat_tmp



With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers