Re: [HACKERS] [PATCHES] Fix for large file support (nonsegment mode support)

2008-03-19 Thread Peter Eisentraut
Zdenek Kotala wrote:
> But as was mentioned in this thread, maybe something like
> CREATE TABLESPACE name LOCATION '/my/location' SEGMENTS 10GB
> would be a good solution. If SEGMENTS is not specified, the
> default value is used.

I think you would need a tool to resegmentize a table or tablespace offline, 
usable for example when recovering a backup.

Also, tablespace configuration information is of course also stored in a 
table.  pg_tablespace probably won't become large, but it would probably 
still need to be special-cased, along with other system catalogs perhaps.

And then, how to coordinate offline resegmenting and online tablespace 
operations in a crash-safe way?

Another factor I just thought of is that tar, commonly used as part of a 
backup procedure, can on some systems only handle files up to 8 GB in size.  
There are supposed to be newer formats that can avoid that restriction, but 
it's not clear how widely available these are and what the incantation is to 
get at them.  Of course we don't use tar directly, but if we ever make large 
segments the default, we ought to provide some clear advice for the user on 
how to make their backups.
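
For reference, the 8 GB figure matches the classic ustar header layout: the size field is 12 bytes of octal text, which leaves 11 digits, so the largest size it can record is 8^11 - 1 bytes. A minimal sketch of that arithmetic in C; `ustar_size_fits` and the two macros are names made up here for illustration, not part of any tar implementation:

```c
#include <assert.h>
#include <stdint.h>

/*
 * The ustar size field is 12 bytes: 11 octal digits plus a terminator.
 * Each octal digit carries 3 bits, so the field holds 33 bits in total.
 */
#define USTAR_SIZE_DIGITS 11
#define USTAR_MAX_SIZE ((UINT64_C(1) << (3 * USTAR_SIZE_DIGITS)) - 1)

/*
 * Hypothetical helper: returns 1 if a file of this size can be recorded in
 * a plain ustar header, 0 if it would need an extended format.
 */
int
ustar_size_fits(uint64_t size_bytes)
{
    return size_bytes <= USTAR_MAX_SIZE;
}
```

A 1GB segment fits comfortably; a single 10GB segment file would need one of the extended formats (for example the POSIX pax format or GNU tar's own extension).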

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCHES] Fix for large file support (nonsegment mode support)

2008-03-19 Thread Martijn van Oosterhout
On Wed, Mar 19, 2008 at 09:38:12AM +0100, Peter Eisentraut wrote:
> Another factor I just thought of is that tar, commonly used as part of a
> backup procedure, can on some systems only handle files up to 8 GB in size.
> There are supposed to be newer formats that can avoid that restriction, but
> it's not clear how widely available these are and what the incantation is to
> get at them.  Of course we don't use tar directly, but if we ever make large
> segments the default, we ought to provide some clear advice for the user on
> how to make their backups.

By my reading, GNU tar handles larger files and no-one else (not even a
POSIX standard tar) can...

Have a nice day,
-- 
Martijn van Oosterhout   [EMAIL PROTECTED]   http://svana.org/kleptog/
 Please line up in a tree and maintain the heap invariant while 
 boarding. Thank you for flying nlogn airlines.




Re: [HACKERS] [PATCHES] Fix for large file support (nonsegment mode support)

2008-03-19 Thread Kenneth Marshall
On Wed, Mar 19, 2008 at 10:51:12AM +0100, Martijn van Oosterhout wrote:
> On Wed, Mar 19, 2008 at 09:38:12AM +0100, Peter Eisentraut wrote:
>> Another factor I just thought of is that tar, commonly used as part of a
>> backup procedure, can on some systems only handle files up to 8 GB in size.
>> There are supposed to be newer formats that can avoid that restriction, but
>> it's not clear how widely available these are and what the incantation is to
>> get at them.  Of course we don't use tar directly, but if we ever make large
>> segments the default, we ought to provide some clear advice for the user on
>> how to make their backups.
>
> By my reading, GNU tar handles larger files and no-one else (not even a
> POSIX standard tar) can...
>
> Have a nice day,
> -- 
> Martijn van Oosterhout   [EMAIL PROTECTED]   http://svana.org/kleptog/
>> Please line up in a tree and maintain the heap invariant while
>> boarding. Thank you for flying nlogn airlines.

The star program written by Joerg Schilling is a very well written
POSIX compatible tar program that can easily handle files larger than
8GB. It is another backup option.

Cheers,
Ken



Re: [HACKERS] [PATCHES] Fix for large file support (nonsegment mode support)

2008-03-19 Thread Zdeněk Kotala

Peter Eisentraut wrote:
> Zdenek Kotala wrote:
>> But as was mentioned in this thread, maybe something like
>> CREATE TABLESPACE name LOCATION '/my/location' SEGMENTS 10GB
>> would be a good solution. If SEGMENTS is not specified, the
>> default value is used.
>
> I think you would need a tool to resegmentize a table or tablespace offline,
> usable for example when recovering a backup.

Do you mean something like the strip(1) command? I don't see a use case for it
with terabytes of data. You usually have trouble just finding a place to put
the backup.

> Also, tablespace configuration information is of course also stored in a
> table.  pg_tablespace probably won't become large, but it would probably
> still need to be special-cased, along with other system catalogs perhaps.

That is true, and unfortunately it is a special case. The same goes for the
database list, which is kept in a table but is also stored as a text file for
startup purposes. I am more inclined to use a non-table configuration file for
tablespaces, because I don't see any advantage in having it under MVCC control,
and it would also allow defining storage for pg_global and pg_default.

> And then, how to coordinate offline resegmenting and online tablespace
> operations in a crash-safe way?
>
> Another factor I just thought of is that tar, commonly used as part of a
> backup procedure, can on some systems only handle files up to 8 GB in size.
> There are supposed to be newer formats that can avoid that restriction, but
> it's not clear how widely available these are and what the incantation is to
> get at them.  Of course we don't use tar directly, but if we ever make large
> segments the default, we ought to provide some clear advice for the user on
> how to make their backups.

I think tar is OK, at least on Solaris. See man largefile.

The default segment size should still be 1GB. If a DBA decides to increase
it, then it is his responsibility to find a way to process these big files.


Zdenek




Re: [HACKERS] [PATCHES] Fix for large file support (nonsegment mode support)

2008-03-11 Thread Peter Eisentraut
Tom Lane wrote:
> I think this needs to be treated as experimental until it's got a few
> more than zero miles under its belt.

OK, then maybe we should document that.

> I wouldn't be too surprised to
> find that we have to implement it as a run-time switch instead of
> compile-time, in order to not fail miserably when somebody sticks a
> tablespace on an archaic filesystem.

Yes, that sounds quite useful.  Let's wait and see what happens.



Re: [HACKERS] [PATCHES] Fix for large file support (nonsegment mode support)

2008-03-11 Thread Zeugswetter Andreas OSB SD

>> Why is this not the default when supported?
>
> Fear.
>
> Maybe eventually, but right now I think it's too risky.
>
> One point that I already found out the hard way is that sizeof(off_t) = 8
> does not guarantee the availability of largefile support; there can also
> be filesystem-level constraints, and perhaps other things we know not of
> at this point.

Exactly, e.g. AIX is one of those. jfs (not the newer jfs2) has an option
to enable large files, which is not the default and cannot be changed after
crfs. And even if it is enabled, jfs has a 64 GB file size limit!
Does anybody know of others that support large but not huge files?

Andreas



Re: [HACKERS] [PATCHES] Fix for large file support (nonsegment mode support)

2008-03-11 Thread Zeugswetter Andreas OSB SD

>> Why is this not the default when supported?  I am wondering both from the
>> point of view of the user, and in terms of development direction.
>
> Also it would get more buildfarm coverage if it were default.  If it
> breaks something we'll notice earlier.

No we don't, because the buildfarm does not test huge files.

Andreas



Re: [HACKERS] [PATCHES] Fix for large file support (nonsegment mode support)

2008-03-11 Thread Larry Rosenman

On Mon, 10 Mar 2008, Tom Lane wrote:
> Peter Eisentraut [EMAIL PROTECTED] writes:
>> Tom Lane wrote:
>>> Applied with minor corrections.
>>
>> Why is this not the default when supported?
>
> Fear.
>
> Maybe eventually, but right now I think it's too risky.
>
> One point that I already found out the hard way is that sizeof(off_t) = 8
> does not guarantee the availability of largefile support; there can also
> be filesystem-level constraints, and perhaps other things we know not of
> at this point.

Just to note an additional filesystem that will need special action...
The VxFS filesystem has a largefiles option, per filesystem.  At least that
was the case on SCO UnixWare (No, I no longer run it).

LER

> regards, tom lane




--
Larry Rosenman http://www.lerctr.org/~ler
Phone: +1 512-248-2683 E-Mail: [EMAIL PROTECTED]
US Mail: 430 Valona Loop, Round Rock, TX 78681-3893



Re: [HACKERS] [PATCHES] Fix for large file support (nonsegment mode support)

2008-03-11 Thread Tom Lane
Peter Eisentraut [EMAIL PROTECTED] writes:
> Tom Lane wrote:
>> I think this needs to be treated as experimental until it's got a few
>> more than zero miles under its belt.
>
> OK, then maybe we should document that.

Agreed, but at this point we don't even know what hazards we need to
document.

regards, tom lane



Re: [HACKERS] [PATCHES] Fix for large file support (nonsegment mode support)

2008-03-11 Thread Tom Lane
Zeugswetter Andreas OSB SD [EMAIL PROTECTED] writes:
> Exactly, e.g. AIX is one of those. jfs (not the newer jfs2) has an option
> to enable large files, which is not the default and cannot be changed after
> crfs. And even if it is enabled, jfs has a 64 GB file size limit!
> Does anybody know of others that support large but not huge files?

Yeah, HPUX 10 is similar --- 128GB hard maximum.  It does say you
can convert an existing filesystem to largefile support, but it has
to be unmounted.

These examples suggest that maybe what we want is not so much a "no
segments ever" mode as a segment size larger than 1GB.

regards, tom lane



Re: [HACKERS] [PATCHES] Fix for large file support (nonsegment mode support)

2008-03-11 Thread Zdenek Kotala

Tom Lane wrote:
> Zeugswetter Andreas OSB SD [EMAIL PROTECTED] writes:
>> Exactly, e.g. AIX is one of those. jfs (not the newer jfs2) has an option
>> to enable large files, which is not the default and cannot be changed after
>> crfs. And even if it is enabled, jfs has a 64 GB file size limit!
>> Does anybody know of others that support large but not huge files?
>
> Yeah, HPUX 10 is similar --- 128GB hard maximum.  It does say you
> can convert an existing filesystem to largefile support, but it has
> to be unmounted.
>
> These examples suggest that maybe what we want is not so much a "no
> segments ever" mode as a segment size larger than 1GB.

The patch allows segment files bigger than 2/4GB, and the size can be
changed in the source file. But as was mentioned in this thread, maybe
something like CREATE TABLESPACE name LOCATION '/my/location'
SEGMENTS 10GB would be a good solution. If SEGMENTS is not specified, the
default value is used.

Zdenek

PS: ZFS is happy with 2^64bit size and UFS has a 1TB file size limit
(depending on the Solaris version)




Re: [HACKERS] [PATCHES] Fix for large file support (nonsegment mode support)

2008-03-11 Thread Tom Lane
Zdenek Kotala [EMAIL PROTECTED] writes:
> Tom Lane wrote:
>> These examples suggest that maybe what we want is not so much a "no
>> segments ever" mode as a segment size larger than 1GB.
>
> PS: ZFS is happy with 2^64bit size and UFS has 1TB file size limit
> (depends on solaris version)

So even on Solaris, "no segments ever" is actually a pretty awful idea.
As it stands, the code would fail on tables > 1TB.

I'm thinking we need to reconsider this patch.  Rather than disabling
segmentation altogether, we should see it as allowing use of segments
larger than 1GB.  I suggest that we ought to just flat rip out the non
segmenting code paths in md.c, and instead look into what segment sizes
are appropriate on different platforms.
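
To make the platform question concrete, here is a small sketch of one possible policy: take the filesystem's maximum file size (64GB for jfs, 128GB for HPUX 10, 1TB for UFS, per the reports in this thread) and pick the largest power-of-two number of blocks that still fits. The power-of-two rule, the function name `pick_segment_blocks`, and the 8192-byte block size used in the note below are assumptions for illustration, not something the patch does.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical policy: return the largest power-of-two block count whose
 * total size does not exceed fs_max_file_bytes, or 0 if not even one block
 * fits under the limit.
 */
uint64_t
pick_segment_blocks(uint64_t fs_max_file_bytes, uint32_t block_size)
{
    uint64_t max_blocks;
    uint64_t seg_blocks = 1;

    assert(block_size > 0);

    max_blocks = fs_max_file_bytes / block_size;
    if (max_blocks == 0)
        return 0;

    /* Keep doubling while the doubled value still fits under the limit. */
    while (seg_blocks <= max_blocks / 2)
        seg_blocks *= 2;

    return seg_blocks;
}
```

With 8192-byte blocks, a limit of exactly 64GB gives 8388608 blocks (a 64GB segment), while a limit one byte below 64GB drops to 4194304 blocks (32GB).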

regards, tom lane



Re: [HACKERS] [PATCHES] Fix for large file support (nonsegment mode support)

2008-03-11 Thread Bruce Momjian
Tom Lane wrote:
> Zdenek Kotala [EMAIL PROTECTED] writes:
>> Tom Lane wrote:
>>> These examples suggest that maybe what we want is not so much a "no
>>> segments ever" mode as a segment size larger than 1GB.
>>
>> PS: ZFS is happy with 2^64bit size and UFS has 1TB file size limit
>> (depends on solaris version)
>
> So even on Solaris, "no segments ever" is actually a pretty awful idea.
> As it stands, the code would fail on tables > 1TB.
>
> I'm thinking we need to reconsider this patch.  Rather than disabling
> segmentation altogether, we should see it as allowing use of segments
> larger than 1GB.  I suggest that we ought to just flat rip out the non
> segmenting code paths in md.c, and instead look into what segment sizes
> are appropriate on different platforms.

Agreed.

-- 
  Bruce Momjian  [EMAIL PROTECTED]        http://momjian.us
  EnterpriseDB http://postgres.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +



Re: [HACKERS] [PATCHES] Fix for large file support (nonsegment mode support)

2008-03-11 Thread Zdenek Kotala

Tom Lane wrote:
> Zdenek Kotala [EMAIL PROTECTED] writes:
>> Tom Lane wrote:
>>> These examples suggest that maybe what we want is not so much a "no
>>> segments ever" mode as a segment size larger than 1GB.
>>
>> PS: ZFS is happy with 2^64bit size and UFS has 1TB file size limit
>> (depends on solaris version)
>
> So even on Solaris, "no segments ever" is actually a pretty awful idea.
> As it stands, the code would fail on tables > 1TB.
>
> I'm thinking we need to reconsider this patch.  Rather than disabling
> segmentation altogether, we should see it as allowing use of segments
> larger than 1GB.  I suggest that we ought to just flat rip out the non
> segmenting code paths in md.c, and instead look into what segment sizes
> are appropriate on different platforms.

Yes, agreed. It seems only ZFS is OK at the moment, and if somebody sets
32TB he gets nonsegment mode anyway. I looked into the POSIX standard and
there is a useful function that can be used. See

http://www.opengroup.org/onlinepubs/009695399/functions/pathconf.html

Maybe we can put an additional test into configure and collect the
appropriate data from the buildfarm.

I think the current patch could stay in CVS, and I will rip out the
non-segment code path in a new patch.


Zdenek



Re: [HACKERS] [PATCHES] Fix for large file support (nonsegment mode support)

2008-03-11 Thread Tom Lane
Zdenek Kotala [EMAIL PROTECTED] writes:
> I think current patch could stay in CVS and I will rip out non segment
> code path in a new patch.

Sure, I feel no need to revert what's applied.  Have at it.

regards, tom lane



Re: [HACKERS] [PATCHES] Fix for large file support (nonsegment mode support)

2008-03-11 Thread Peter Eisentraut
Zdenek Kotala wrote:
> Yes, agree. It seems only ZFS is OK at this moment and if somebody sets
> 32TB he gets nonsegment mode anyway.

Surely if you set the segment size to INT64_MAX, you will get nonsegmented 
behavior anyway, so two code paths might not be necessary at all.
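
A sketch of why one code path can be enough: if a block's location is always computed by dividing by the segment size, a large enough segment size makes every block land in segment 0. The struct and function names below are hypothetical and are not taken from md.c.

```c
#include <assert.h>
#include <stdint.h>

/* Where a block lives: which segment file, and the block offset inside it. */
typedef struct SegmentLocation
{
    uint64_t    segno;
    uint64_t    blkoff;
} SegmentLocation;

/*
 * Map a relation block number to a segment, given the segment size in
 * blocks. The same arithmetic serves both the segmented and the effectively
 * nonsegmented case; only seg_blocks differs.
 */
SegmentLocation
locate_block(uint64_t blkno, uint64_t seg_blocks)
{
    SegmentLocation loc;

    assert(seg_blocks > 0);

    loc.segno = blkno / seg_blocks;
    loc.blkoff = blkno % seg_blocks;
    return loc;
}
```

With 131072 blocks per segment (1GB of 8192-byte blocks), block 131072 is the first block of segment 1; with seg_blocks set to INT64_MAX the same block stays in segment 0 at offset 131072.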

> I looked into posix standard and
> there is useful function which can be used. See
>
> http://www.opengroup.org/onlinepubs/009695399/functions/pathconf.html
>
> Maybe we can put additional test into configure and collect appropriate
> data from buildfarm.

It might be good to just check first if it returns realistic values for the 
example cases that have been mentioned.



Re: [HACKERS] [PATCHES] Fix for large file support (nonsegment mode support)

2008-03-11 Thread Tom Lane
Peter Eisentraut [EMAIL PROTECTED] writes:
> Zdenek Kotala wrote:
>> Maybe we can put additional test into configure and collect appropriate
>> data from buildfarm.
>
> It might be good to just check first if it returns realistic values for the
> example cases that have been mentioned.

Yeah, please just make up a ten-line C program that prints the numbers
you want, and post it on -hackers for people to try.  If manual testing
says that it's printing useful numbers, then it would be time enough to
think about how to get it into the buildfarm.
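
Something along these lines might do as a starting point. It is only a sketch: it asks pathconf() for _PC_FILESIZEBITS, which POSIX describes as the minimum number of bits needed to represent the maximum file size as a signed integer. The helper names `filesize_bits` and `print_filesize_bits` are made up here, and whether real systems report realistic values is exactly what would need testing.

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <unistd.h>

/*
 * Returns FILESIZEBITS for the directory at path, or -1 if pathconf()
 * reports no value (errno is 0 when there is simply no limit to report,
 * nonzero on a real error).
 */
long
filesize_bits(const char *path)
{
    assert(path != NULL);

    errno = 0;
    return pathconf(path, _PC_FILESIZEBITS);
}

/* Print the result for one directory in a form that is easy to post. */
void
print_filesize_bits(const char *path)
{
    long        bits = filesize_bits(path);

    if (bits < 0)
        printf("%s: FILESIZEBITS not available (errno %d)\n", path, errno);
    else
        printf("%s: FILESIZEBITS = %ld\n", path, bits);
}
```

Calling print_filesize_bits() from main() on each tablespace directory given on the command line would produce the numbers to post.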

regards, tom lane



Re: [HACKERS] [PATCHES] Fix for large file support (nonsegment mode support)

2008-03-10 Thread Tom Lane
Zdenek Kotala [EMAIL PROTECTED] writes:
> Here is the latest version of the nonsegment support patch. I changed
> LET_OS_MANAGE_FILESIZE to USE_SEGMENTED_FILES and I added the
> --disable-segmented-files switch to configure. I kept the tuplestore
> behavior, and it still splits files in both modes.

Applied with minor corrections.

regards, tom lane



Re: [HACKERS] [PATCHES] Fix for large file support (nonsegment mode support)

2008-03-10 Thread Peter Eisentraut
Tom Lane wrote:
> Zdenek Kotala [EMAIL PROTECTED] writes:
>> Here is the latest version of the nonsegment support patch. I changed
>> LET_OS_MANAGE_FILESIZE to USE_SEGMENTED_FILES and I added the
>> --disable-segmented-files switch to configure. I kept the tuplestore
>> behavior, and it still splits files in both modes.
>
> Applied with minor corrections.

Why is this not the default when supported?  I am wondering both from the 
point of view of the user, and in terms of development direction.



Re: [HACKERS] [PATCHES] Fix for large file support (nonsegment mode support)

2008-03-10 Thread Tom Lane
Peter Eisentraut [EMAIL PROTECTED] writes:
> Tom Lane wrote:
>> Applied with minor corrections.
>
> Why is this not the default when supported?

Fear.

Maybe eventually, but right now I think it's too risky.

One point that I already found out the hard way is that sizeof(off_t) = 8
does not guarantee the availability of largefile support; there can also
be filesystem-level constraints, and perhaps other things we know not of
at this point.
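
One way to check this at run time rather than trusting sizeof(off_t) is to actually try writing past the offset in question in the target directory. A minimal sketch under that assumption; `can_write_at` is a hypothetical helper, not something in the patch:

```c
#define _POSIX_C_SOURCE 200809L
#define _FILE_OFFSET_BITS 64    /* request a 64-bit off_t where it is optional */

#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Try to write one byte at the given offset in a scratch file created in
 * dir. Returns 1 if the write succeeded, 0 if the seek or write was
 * refused, and -1 if the scratch file could not be created at all.
 */
int
can_write_at(const char *dir, off_t offset)
{
    char        path[4096];
    int         len;
    int         fd;
    int         ok;

    assert(dir != NULL);

    len = snprintf(path, sizeof(path), "%s/largefile-probe-XXXXXX", dir);
    if (len < 0 || len >= (int) sizeof(path))
        return -1;

    fd = mkstemp(path);
    if (fd < 0)
        return -1;

    /* Unlink right away so the file vanishes once the descriptor closes. */
    unlink(path);

    ok = (lseek(fd, offset, SEEK_SET) == offset && write(fd, "x", 1) == 1);
    close(fd);

    return ok ? 1 : 0;
}
```

For example, can_write_at(dir, ((off_t) 1 << 31)) probes the 2GB boundary; on filesystems that support holes the scratch file stays sparse, so the probe is cheap.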

I think this needs to be treated as experimental until it's got a few
more than zero miles under its belt.  I wouldn't be too surprised to
find that we have to implement it as a run-time switch instead of
compile-time, in order to not fail miserably when somebody sticks a
tablespace on an archaic filesystem.

regards, tom lane



Re: [HACKERS] [PATCHES] Fix for large file support (nonsegment mode support)

2008-03-10 Thread Alvaro Herrera
Peter Eisentraut wrote:
> Tom Lane wrote:
>> Zdenek Kotala [EMAIL PROTECTED] writes:
>>> Here is the latest version of the nonsegment support patch. I changed
>>> LET_OS_MANAGE_FILESIZE to USE_SEGMENTED_FILES and I added the
>>> --disable-segmented-files switch to configure. I kept the tuplestore
>>> behavior, and it still splits files in both modes.
>>
>> Applied with minor corrections.
>
> Why is this not the default when supported?  I am wondering both from the
> point of view of the user, and in terms of development direction.

Also it would get more buildfarm coverage if it were default.  If it
breaks something we'll notice earlier.

-- 
Alvaro Herrera                        http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support



Re: [HACKERS] [PATCHES] Fix for large file support (nonsegment mode support)

2008-03-10 Thread Tom Lane
Alvaro Herrera [EMAIL PROTECTED] writes:
> Also it would get more buildfarm coverage if it were default.  If it
> breaks something we'll notice earlier.

Since nothing the regression tests do even approaches 1GB, the odds that
the buildfarm will notice problems are approximately zero.

regards, tom lane



Re: [HACKERS] [PATCHES] Fix for large file support (nonsegment mode support)

2007-04-30 Thread Zdenek Kotala

Tom Lane wrote:
> [ redirecting to -hackers for wider comment ]
>
> Zdenek Kotala [EMAIL PROTECTED] writes:
>> Tom Lane wrote:
>> LET_OS_MANAGE_FILESIZE is a good way. I think I fixed one problem with
>> this option: the size of the offset. I went through the code and did not
>> see any other problem there. However, as you mentioned, it needs more
>> testing. I am going to take a server with a large disk array and test it.
>>
>> I would like to add an --enable-largefile switch to configure to make
>> this accessible to a wide group of users. What do you think about it?
>
> Yeah, I was going to suggest the same thing --- but not with that switch
> name.  We already use enable/disable-largefile to control whether 64-bit
> file access is built at all (this mostly affects pg_dump at the moment).
>
> I think the clearest way might be to flip the sense of the variable.
> I never found LET_OS_MANAGE_FILESIZE to be a good name anyway.  I'd
> suggest USE_SEGMENTED_FILES, which defaults to on, and you can
> turn it off via --disable-segmented-files if configure confirms your
> OS has largefile support (thus you could not specify both this and
> --disable-largefile).

Here is the latest version of the nonsegment support patch. I changed
LET_OS_MANAGE_FILESIZE to USE_SEGMENTED_FILES and I added the
--disable-segmented-files switch to configure. I kept the tuplestore
behavior, and it still splits files in both modes.

I also cleaned up some other datatypes a little (e.g. int -> mode_t).
Autoconf and autoheader must be run after applying the patch.

I tested it with a 9GB table and both modes work fine.

Please let me know your comments.

Zdenek


nonseg.patch.gz
Description: GNU Zip compressed data
