[PATCHES] Removing NONSEG mode

2008-04-22 Thread Zdenek Kotala

I attach patch which remove nonsegment mode support. It was discussed during
last commit fest. Nonsegment mode is possible uses only on couple of FS (ZFS,
XFS) and it is not safe on any OS because each OS support more filesystems.

I added RELSEG option to the configure script to allow easily compile with
different segment size (on most filesystem 1T is safe value). As a bonus I added
also BLCKSZ to configure script. It is not important for this patch but it could 
be useful e.g. for buildfarm testing with different BLCKSZ.


Patch requires to run autoconf and autoheader.

Zdenek

PS: --with-segsize=1/1024 allows set segsize to 1MB - good for testing


Index: configure.in
===
RCS file: /zfs_data/cvs_pgsql/cvsroot/pgsql/configure.in,v
retrieving revision 1.555
diff -c -r1.555 configure.in
*** configure.in	30 Mar 2008 04:08:14 -	1.555
--- configure.in	21 Apr 2008 15:19:59 -
***
*** 220,233 
  #
  # Data file segmentation
  #
! PGAC_ARG_BOOL(enable, segmented-files, yes,
!   [  --disable-segmented-files disable data file segmentation (requires largefile support)])
  
  #
  # C compiler
  #
  
  # For historical reasons you can also use --with-CC to specify the C compiler
  # to use, although the standard way to do this is to set the CC environment
  # variable.
  PGAC_ARG_REQ(with, CC, [], [CC=$with_CC])
--- 220,287 
  #
  # Data file segmentation
  #
! AC_MSG_CHECKING([for default relation segment size])
! PGAC_ARG_REQ(with, segsize, [  --with-segsize=RELSEG_SIZE  change default relation segment size in GB [[1]]],
!  [default_segsize=$withval],
!  [default_segsize=1])
! AC_MSG_RESULT([${default_segsize}GB])
! AC_DEFINE_UNQUOTED([RELSEG_SIZE], 1024*1024*1024LL*${default_segsize}/BLCKSZ, [
!  RELSEG_SIZE is the maximum number of blocks allowed in one disk
!  file. Thus, the maximum size of a single file is RELSEG_SIZE * BLCKSZ;
!  relations bigger than that are divided into multiple files.
!  
!  RELSEG_SIZE * BLCKSZ must be less than your OS' limit on file size.
!  This is often 2 GB or 4GB in a 32-bit operating system, unless you
!  have large file support enabled.  By default, we make the limit 1
!  GB to avoid any possible integer-overflow problems within the OS.
!  A limit smaller than necessary only means we divide a large
!  relation into more chunks than necessary, so it seems best to err
!  in the direction of a small limit.  (Besides, a power-of-2 value
!  saves a few cycles in md.c.)
  
+  Changing RELSEG_SIZE requires an initdb.
+ ])
+ AC_SUBST(default_segsize)
+ 
+ #
+ # Block size
  #
+ AC_MSG_CHECKING([for default block size])
+ PGAC_ARG_REQ(with, blocksize, [  --with-blocksize=BLCKSZ change default block size (1,2,4,8,16,32 are allowed values). [[8]]],
+  [default_blocksize=$withval],
+  [default_blocksize=8])
+ case ${default_blocksize} in
+   1) default_blocksize=1024;;
+   2) default_blocksize=2048;;
+   4) default_blocksize=4096;;
+   8) default_blocksize=8192;;
+  16) default_blocksize=16384;;
+  32) default_blocksize=32768;;
+   *) AC_MSG_ERROR([Invalid block size. Allowed values are 1,2,4,8,16,32.])
+ esac
+ 
+ AC_MSG_RESULT([${default_blocksize}B])
+ AC_DEFINE_UNQUOTED([BLCKSZ], ${default_blocksize}, [
+  Size of a disk block --- this also limits the size of a tuple.  You
+  can set it bigger if you need bigger tuples (although TOAST should
+  reduce the need to have large tuples, since fields can be spread
+  across multiple tuples).
+  
+  BLCKSZ must be a power of 2.  The maximum possible value of BLCKSZ
+  is currently 2^15 (32768).  This is determined by the 15-bit widths
+  of the lp_off and lp_len fields in ItemIdData (see
+  include/storage/itemid.h).
+  
+  Changing BLCKSZ requires an initdb.
+ ]) 
+ AC_SUBST(default_blocksize)
+ 
+ 
  # C compiler
  #
  
  # For historical reasons you can also use --with-CC to specify the C compiler
+ 
  # to use, although the standard way to do this is to set the CC environment
  # variable.
  PGAC_ARG_REQ(with, CC, [], [CC=$with_CC])
***
*** 1435,1443 
  
  # Check for largefile support (must be after AC_SYS_LARGEFILE)
  AC_CHECK_SIZEOF([off_t])
! 
! if test $ac_cv_sizeof_off_t -lt 8 -o $enable_segmented_files = yes; then 
!   AC_DEFINE([USE_SEGMENTED_FILES], 1, [Define to split data files into 1GB segments.]) 
  fi
  
  # SunOS doesn't handle negative byte comparisons properly with +/- return
--- 1489,1496 
  
  # Check for largefile support (must be after AC_SYS_LARGEFILE)
  AC_CHECK_SIZEOF([off_t])
! if test $ac_cv_sizeof_off_t -lt 8 -a $default_segsize != 1; then 
!AC_MSG_ERROR([Large file support is not enabled. Segment size cannot be larger then 1GB.]) 
  fi
  
  # SunOS doesn't handle negative byte comparisons properly with +/- return
Index: src/backend/storage/file/buffile.c
===

Re: [PATCHES] Improve shutdown during online backup, take 3

2008-04-22 Thread Albe Laurenz
Simon Riggs wrote:
 Patch applies, and works as described. Looks good for final apply.
 
 Few minor thoughts:
 
 * Text in pg_ctl should be WARNING, not Warning.
 * CancelBackup() API looks strange, not sure why
 * Need to mention that CancelBackup() is not the right way to end a
 backup, so that function and pg_stop_backup should reference 
 each other
 
 Other than those, I like it. Very useful patch.

Thanks for the feedback!

- I have replaced Warning with WARNING.
- I have changed the API of CancelBackup() to return void.
  I don't use the return code anyway, and I guess it's less confusing
  and strange that way.
- I have added comments to disambiguate pg_stop_backup() and CancelBackup().

CancelBackup now writes a message to the server log if it cannot delete
backup_label - I hope that's not too verbose...

Yours,
Laurenz Albe


backup-shut.patch
Description: backup-shut.patch


backup-shut.doc.patch
Description: backup-shut.doc.patch

-- 
Sent via pgsql-patches mailing list (pgsql-patches@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-patches


Re: [PATCHES] Removing NONSEG mode

2008-04-22 Thread Alvaro Herrera
Zdenek Kotala wrote:
 I attach patch which remove nonsegment mode support. It was discussed during
 last commit fest. Nonsegment mode is possible uses only on couple of FS (ZFS,
 XFS) and it is not safe on any OS because each OS support more filesystems.

Added to May commitfest, thanks.

-- 
Alvaro Herrerahttp://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

-- 
Sent via pgsql-patches mailing list (pgsql-patches@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-patches


Re: [PATCHES] Improve shutdown during online backup, take 3

2008-04-22 Thread Magnus Hagander
Albe Laurenz wrote:
 Simon Riggs wrote:
  Patch applies, and works as described. Looks good for final apply.
  
  Few minor thoughts:
  
  * Text in pg_ctl should be WARNING, not Warning.
  * CancelBackup() API looks strange, not sure why
  * Need to mention that CancelBackup() is not the right way to end a
  backup, so that function and pg_stop_backup should reference 
  each other
  
  Other than those, I like it. Very useful patch.
 
 Thanks for the feedback!
 
 - I have replaced Warning with WARNING.
 - I have changed the API of CancelBackup() to return void.
   I don't use the return code anyway, and I guess it's less confusing
   and strange that way.
 - I have added comments to disambiguate pg_stop_backup() and
 CancelBackup().
 
 CancelBackup now writes a message to the server log if it cannot
 delete backup_label - I hope that's not too verbose...

This doesn't look like our normal coding standards, and should probably
be changed:
+   if (0 != stat(BACKUP_LABEL_FILE, stat_buf))

(there's a number of similar places)

//Magnus

-- 
Sent via pgsql-patches mailing list (pgsql-patches@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-patches


Re: [PATCHES] Partial match in GIN (next vesrion)

2008-04-22 Thread Teodor Sigaev

http://www.sigaev.ru/misc/partial_match_gin-0.9.gz
Sync with CVS.

--
Teodor Sigaev   E-mail: [EMAIL PROTECTED]
   WWW: http://www.sigaev.ru/

--
Sent via pgsql-patches mailing list (pgsql-patches@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-patches


[PATCHES] Snapshot management, final

2008-04-22 Thread Alvaro Herrera
Here is the patch for snapshot management I currently have.

Below is a complete description of the changes in this patch.

The most important change is on the use of CopySnapshot and the games
with ActiveSnapshot.  On the new code, whenever a piece of code needs a
snapshot to persist, it is registered by using RegisterSnapshot.  As
soon as it is done with it, it must be unregistered with
UnregisterSnapshot.

For ActiveSnapshot, we deal with a stack structure.  Callers needing an
Active snap set call PushActiveSnapshot, and PopActiveSnapshot when they
no longer need it.

GetSnapshotData still creates initial versions of each snapshot in
static memory.  There's a new boolean on the SnapshotData struct (set to
false by GetSnapshotData), which indicates whether the snapshot has been
copied out of static memory.  When a snapshot is Registered or
PushedActive, this bit is checked: if false, then a copy of the snapshot
is made on the dedicated SnapshotContext, and the copied flag is
flipped.

SnapshotData also carries two new reference counts: one for the Register
list, another one for the ActiveSnapshot stack.  Whenever a snap is
Unregistered or Popped, both refcounts are checked.  If both are found
to be zero then memory can be released.

On Unregister and Pop, after deleting the snapshot, we also verify the
whole lists.  If they are empty, it means no more live snapshot remain
for this transaction.  In this case we can reset MyProc-xmin to
InvalidXid.  Thus, GetSnapshotData must now recalculate MyProc-xmin
each time it finds xmin to be Invalid, which can not only happen for the
serializable snapshot but at any time.

A serializable transaction Registers its snapshot as first thing, and
keeps it registered until transaction end.


Note: I had previously made noises about changing the semantics of
MyProc-xmin or introducing a new VacuumXmin.  I have refrained from
doing so in this patch.


Those are the high-level changes.  Some coding changes:

- SerializableSnapshot and LatestSnapshot as global symbols are no more.

- A couple of PG_TRY blocks have been removed.  Particularly, on
plancache.c this means the do_planning() routine is now pointless and
has been folded back into its sole caller.

- Three CopySnapshot call sites remain outside snapmgr.c: DoCopy() on
copy.c, ExplainOnePlan() on explain.c and _SPI_execute_plan() on spi.c.
They are there because they grab the current ActiveSnapshot, modify it,
and then use the resulting snapshot.  There is no corresponding
FreeSnapshot, because it's not needed.

- CommandCounterIncrement now calls into snapmgr.c to set the CID of the
static snapshots.

- CREATE INDEX CONCURRENTLY, VACUUM FULL, and CLUSTER must now
explicitely Pop the ActiveSnapshot set by PortalRunUtility before
calling CommitTransactionCommand.

-- 
Alvaro Herrerahttp://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.


snapshot-6.patch.bz2
Description: Binary data

-- 
Sent via pgsql-patches mailing list (pgsql-patches@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-patches


Re: [PATCHES] Snapshot management, final

2008-04-22 Thread Simon Riggs
On Tue, 2008-04-22 at 15:49 -0400, Alvaro Herrera wrote:

 - Three CopySnapshot call sites remain outside snapmgr.c: DoCopy() on
 copy.c, ExplainOnePlan() on explain.c and _SPI_execute_plan() on spi.c.
 They are there because they grab the current ActiveSnapshot, modify it,
 and then use the resulting snapshot.  There is no corresponding
 FreeSnapshot, because it's not needed.

Not needed? How can we be certain that the modified snapshot does not
outlive its original source?

-- 
  Simon Riggs
  2ndQuadrant  http://www.2ndQuadrant.com


-- 
Sent via pgsql-patches mailing list (pgsql-patches@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-patches


Re: [PATCHES] Snapshot management, final

2008-04-22 Thread Alvaro Herrera
Simon Riggs wrote:
 On Tue, 2008-04-22 at 15:49 -0400, Alvaro Herrera wrote:
 
  - Three CopySnapshot call sites remain outside snapmgr.c: DoCopy() on
  copy.c, ExplainOnePlan() on explain.c and _SPI_execute_plan() on spi.c.
  They are there because they grab the current ActiveSnapshot, modify it,
  and then use the resulting snapshot.  There is no corresponding
  FreeSnapshot, because it's not needed.
 
 Not needed? How can we be certain that the modified snapshot does not
 outlive its original source?

It's not CopySnapshot that's not needed, but FreeSnapshot.  The point
here is that the snapshot will be freed automatically as soon as it is
PopActiveSnapshot'd out of existance.  CopySnapshot creates a new,
separate copy of the passed snapshot, and each of them will be freed
(separately) as soon as their refcounts reach zero.

-- 
Alvaro Herrerahttp://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

-- 
Sent via pgsql-patches mailing list (pgsql-patches@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-patches


Re: [PATCHES] Snapshot management, final

2008-04-22 Thread Simon Riggs
On Tue, 2008-04-22 at 17:50 -0400, Alvaro Herrera wrote:
 Simon Riggs wrote:
  On Tue, 2008-04-22 at 15:49 -0400, Alvaro Herrera wrote:
  
   - Three CopySnapshot call sites remain outside snapmgr.c: DoCopy() on
   copy.c, ExplainOnePlan() on explain.c and _SPI_execute_plan() on spi.c.
   They are there because they grab the current ActiveSnapshot, modify it,
   and then use the resulting snapshot.  There is no corresponding
   FreeSnapshot, because it's not needed.
  
  Not needed? How can we be certain that the modified snapshot does not
  outlive its original source?
 
 It's not CopySnapshot that's not needed, but FreeSnapshot.  The point
 here is that the snapshot will be freed automatically as soon as it is
 PopActiveSnapshot'd out of existance.  CopySnapshot creates a new,
 separate copy of the passed snapshot, and each of them will be freed
 (separately) as soon as their refcounts reach zero.

OK, so it can;t be copied to a longer lived memory context?

-- 
  Simon Riggs
  2ndQuadrant  http://www.2ndQuadrant.com


-- 
Sent via pgsql-patches mailing list (pgsql-patches@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-patches


Re: [PATCHES] Snapshot management, final

2008-04-22 Thread Alvaro Herrera
Simon Riggs wrote:

 OK, so it can;t be copied to a longer lived memory context?

CopySnapshot always copies snapshots to SnapshotContext, which is a
context that lives until transaction end.  There's no mechanism for
copying a snapshot into another context, because I don't see the need.

If you need that ability, please explain.

-- 
Alvaro Herrerahttp://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

-- 
Sent via pgsql-patches mailing list (pgsql-patches@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-patches


Re: [PATCHES] Snapshot management, final

2008-04-22 Thread Tom Lane
Alvaro Herrera [EMAIL PROTECTED] writes:
 Simon Riggs wrote:
 OK, so it can;t be copied to a longer lived memory context?

 CopySnapshot always copies snapshots to SnapshotContext, which is a
 context that lives until transaction end.  There's no mechanism for
 copying a snapshot into another context, because I don't see the need.

The only reason we have memory contexts at all is to avoid the need to
track individual palloc'd objects.  Since we're instituting exactly such
tracking for snapshots, there's no value in placing them in
general-purpose memory contexts.

regards, tom lane

-- 
Sent via pgsql-patches mailing list (pgsql-patches@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-patches


Re: [PATCHES] Snapshot management, final

2008-04-22 Thread Alvaro Herrera
Tom Lane wrote:
 Alvaro Herrera [EMAIL PROTECTED] writes:
 
  CopySnapshot always copies snapshots to SnapshotContext, which is a
  context that lives until transaction end.  There's no mechanism for
  copying a snapshot into another context, because I don't see the need.
 
 The only reason we have memory contexts at all is to avoid the need to
 track individual palloc'd objects.  Since we're instituting exactly such
 tracking for snapshots, there's no value in placing them in
 general-purpose memory contexts.

The problem is that we reuse snapshots, and not all uses have the same
longevity.  If a context goes away from under a snapshot and there are
other references to it, the result is a dangling pointer somewhere.
That's why we have reference counts on snaps: we know we can free one
when its refcounts are zero.  At the same time, the snapshots all go
away at transaction end with TopTransactionContext.

The other possible approach to this problem is creating a separate copy
each time a snapshot is reused, but this just causes extra palloc'ing
for no gain at all.

-- 
Alvaro Herrerahttp://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

-- 
Sent via pgsql-patches mailing list (pgsql-patches@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-patches