Re: [HACKERS] pg_dump and large files - is this a problem?

2002-11-04 Thread Bruce Momjian

Does this resolve our AIX compile problem?

---

Tom Lane wrote:
> "Zeugswetter Andreas SB SD" <[EMAIL PROTECTED]> writes:
> > The issue is, that you need to remove the #include "bootstrap_tokens.h"
> > line from the lex file.
> 
> Good point; I'm surprised gcc doesn't spit up on that.  I've made that
> mod and also added the inclusion-order-correction in pqsignal.c.
> 
>   regards, tom lane
> 

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  [EMAIL PROTECTED]   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073

---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

http://www.postgresql.org/users-lounge/docs/faq.html



Re: [HACKERS] pg_dump and large files - is this a problem?

2002-11-04 Thread Tom Lane
"Zeugswetter Andreas SB SD" <[EMAIL PROTECTED]> writes:
> The issue is, that you need to remove the #include "bootstrap_tokens.h"
> line from the lex file.

Good point; I'm surprised gcc doesn't spit up on that.  I've made that
mod and also added the inclusion-order-correction in pqsignal.c.

regards, tom lane




Re: [HACKERS] pg_dump and large files - is this a problem?

2002-11-04 Thread Zeugswetter Andreas SB SD

Tom Lane writes:
> > I think the problem is more accurately described thus:  Flex generated
> > files include <stdio.h> before "postgres.h" due to the way it lays out the
> > code in the output.  stdio.h does something which prevents switching to
> > the large file model later on in postgres.h.  (This manifests itself in
> > unistd.h, but unistd.h itself is not the problem per se.)
> 
> > The proposed fix was to include the flex output in some other file (such
> > as the corresponding grammar file) rather than to compile it separately.
> 
> I have made this change.  CVS tip should compile cleanly now on machines
> where this is an issue.

Hmm, sorry for the late response, but I was away on the (long) weekend :-(
I think your patch might be the source for Christopher's build problem 
(Compile problem on FreeBSD/Alpha).

Peter already had a patch, that I tested, modified a little, and sent him back
for inclusion into CVS.

I will attach his patch with my small fixes for cross reference.
The issue is, that you need to remove the #include "bootstrap_tokens.h"
line from the lex file.

Andreas



flex-patch2.gz
Description: flex-patch2.gz




Re: [HACKERS] pg_dump and large files - is this a problem?

2002-11-01 Thread Tom Lane
Peter Eisentraut <[EMAIL PROTECTED]> writes:
> I think the problem is more accurately described thus:  Flex generated
> files include <stdio.h> before "postgres.h" due to the way it lays out the
> code in the output.  stdio.h does something which prevents switching to
> the large file model later on in postgres.h.  (This manifests itself in
> unistd.h, but unistd.h itself is not the problem per se.)

> The proposed fix was to include the flex output in some other file (such
> as the corresponding grammar file) rather than to compile it separately.

I have made this change.  CVS tip should compile cleanly now on machines
where this is an issue.

regards, tom lane




Re: [HACKERS] pg_dump and large files - is this a problem?

2002-10-31 Thread Tom Lane
Peter Eisentraut <[EMAIL PROTECTED]> writes:
> The proposed fix was to include the flex output in some other file (such
> as the corresponding grammar file) rather than to compile it separately.

Seems like a reasonable solution.  Can you make that happen in the next
day or two?  If not, I'll take a whack at it ...

regards, tom lane




Re: [HACKERS] pg_dump and large files - is this a problem?

2002-10-29 Thread Peter Eisentraut
Zeugswetter Andreas SB SD writes:

> > AIX is too stupid to wrap unistd.h in an "#ifndef" to protect against
> > double inclusion?  I suppose we could do that for them...
>
> I guess that is exactly not wanted, since that would hide the actual
> problem, namely that _LARGE_FILE_API gets defined (off_t --> 32bit).
> Thus I think IBM did not protect unistd.h on purpose.

I think the problem is more accurately described thus:  Flex generated
files include <stdio.h> before "postgres.h" due to the way it lays out the
code in the output.  stdio.h does something which prevents switching to
the large file model later on in postgres.h.  (This manifests itself in
unistd.h, but unistd.h itself is not the problem per se.)

The proposed fix was to include the flex output in some other file (such
as the corresponding grammar file) rather than to compile it separately.
The patch just needs to be tried out.
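
A rough illustration of the include-order point, not the actual patch: the
large-file switch has to be visible before the first system header is read,
which is what compiling the scanner as part of another file achieves. The
wrapper file and the scanner file name below are hypothetical. _LARGE_FILES is
the AIX macro named in this thread; _FILE_OFFSET_BITS is the spelling used by
glibc and some other systems and is an assumption here.

```c
/* Hypothetical wrapper translation unit; file names are illustrative only. */

/* Large-file switches must come before ANY system header. */
#define _LARGE_FILES 1          /* AIX spelling, as discussed in this thread */
#define _FILE_OFFSET_BITS 64    /* assumption: spelling used by glibc and others */

#include <limits.h>
#include <stdio.h>
#include <sys/types.h>

/*
 * In the proposed fix the flex output would be pulled in at this point,
 * after the project header, instead of being compiled on its own, e.g.
 *     #include "scanner_output.c"     (hypothetical name)
 * so that its own <stdio.h> include becomes a harmless repeat.
 */

/* Width of off_t in bits, as seen by this translation unit. */
int off_t_bits(void)
{
    return (int) (sizeof(off_t) * CHAR_BIT);
}
```

If the macros were moved below the first #include, a platform like the one
described above could already have committed to the 32-bit off_t by then.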

-- 
Peter Eisentraut   [EMAIL PROTECTED]





Re: [HACKERS] pg_dump and large files - is this a problem?

2002-10-29 Thread Zeugswetter Andreas SB SD

> >> Yeah.  AFAICS the only way around this is to avoid doing any I/O
> >> operations in the flex-generated files.  Fortunately, that's not much
> >> of a restriction.
> 
> > Unfortunately I do not think that is sufficient, since the problem is already
> > at the #include level. The compiler barfs on the second #include <unistd.h>
> > from postgres.h
> 
> AIX is too stupid to wrap unistd.h in an "#ifndef" to protect against
> double inclusion?  I suppose we could do that for them...

I guess that is exactly not wanted, since that would hide the actual
problem, namely that _LARGE_FILE_API gets defined (off_t --> 32bit).
Thus I think IBM did not protect unistd.h on purpose.

Andreas




Re: [HACKERS] pg_dump and large files - is this a problem?

2002-10-29 Thread Tom Lane
"Zeugswetter Andreas SB SD" <[EMAIL PROTECTED]> writes:
>> Yeah.  AFAICS the only way around this is to avoid doing any I/O
>> operations in the flex-generated files.  Fortunately, that's not much
>> of a restriction.

> Unfortunately I do not think that is sufficient, since the problem is already
> at the #include level. The compiler barfs on the second #include <unistd.h>
> from postgres.h

AIX is too stupid to wrap unistd.h in an "#ifndef" to protect against
double inclusion?  I suppose we could do that for them...

regards, tom lane




Re: [HACKERS] pg_dump and large files - is this a problem?

2002-10-29 Thread Zeugswetter Andreas SB SD

> > The problem with flex is, that the generated c file does #include <unistd.h>
> > before we #include "postgres.h".
> > In this situation _LARGE_FILES is not defined for unistd.h and unistd.h
> > chooses to define _LARGE_FILE_API, those two are not compatible.
> 
> Yeah.  AFAICS the only way around this is to avoid doing any I/O
> operations in the flex-generated files.  Fortunately, that's not much
> of a restriction.

Unfortunately I do not think that is sufficient, since the problem is already 
at the #include level. The compiler barfs on the second #include <unistd.h>
from postgres.h

Andreas




Re: [HACKERS] pg_dump and large files - is this a problem?

2002-10-28 Thread Tom Lane
"Zeugswetter Andreas SB SD" <[EMAIL PROTECTED]> writes:
> The problem with flex is, that the generated c file does #include <unistd.h>
> before we #include "postgres.h".
> In this situation _LARGE_FILES is not defined for unistd.h and unistd.h
> chooses to define _LARGE_FILE_API, those two are not compatible.

Yeah.  AFAICS the only way around this is to avoid doing any I/O
operations in the flex-generated files.  Fortunately, that's not much
of a restriction.

regards, tom lane




Re: [HACKERS] pg_dump and large files - is this a problem?

2002-10-28 Thread Zeugswetter Andreas SB SD

> > > The question is *which* seek APIs we need to support.  Are there any
> > > besides fseeko() and fgetpos()?
> > 
> > On AIX we have 
> > int fseeko64 (FILE* Stream, off64_t Offset, int Whence);
> > which is intended for large file access for programs that do NOT
> > #define _LARGE_FILES
> > 
> > It is functionality that is available if _LARGE_FILE_API is defined,
> > which is the default if _LARGE_FILES is not defined.
> > 
> > That would have been my preferred way of handling large files on AIX
> > in the two/three? places that need it (pg_dump/restore, psql and backend COPY).
> > This would have had the advantage that off_t is not 64 bit in all other places
> > where it is actually not needed, no ?
> 
> OK, I am focusing on AIX now.  I don't think we can go down the road of
> saying where large file support is needed or not needed.  I think for
> each platform either we support large files or we don't.  Is there a way
> to have off_t be 64 bits everywhere, and if it is, why wouldn't we just
> enable that rather than poke around figuring out where it is needed?

If _LARGE_FILES is defined, off_t is 64 bits on AIX (and fseeko works).
The problem with flex is that the generated C file does #include <unistd.h>
before we #include "postgres.h".
In this situation _LARGE_FILES is not defined for unistd.h, and unistd.h
chooses to define _LARGE_FILE_API; those two are not compatible.

If a general off_t of 64 bits is no performance problem, we should focus
on fixing the #include <unistd.h> issue, and forget what I wanted/hinted.
Peter E. has a patch for this in his pipeline. I can give it a second try
tomorrow.

Sorry for the late answer, I am very pressed currently :-(
Andreas




Re: [HACKERS] pg_dump and large files - is this a problem?

2002-10-24 Thread Philip Warner
At 11:51 PM 24/10/2002 -0400, Bruce Momjian wrote:

> Your idea of using SEEK_SET is good, except I was concerned that the
> checkSeek call will move the file pointer.  Is that OK?  It doesn't seem
> appropriate.

The call is made just after the file is opened (or it should be!), so 
SEEK_SET, 0 will not be a problem.
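
A minimal sketch of the probe being described, assuming it runs immediately
after the file is opened and before any read or write. The function name is
hypothetical and this is not the checkSeek from the patch. Seeking to offset 0
with SEEK_SET is a no-op on a fresh regular file, and fails on a pipe, which is
the case the check is meant to catch.

```c
#define _POSIX_C_SOURCE 200809L   /* assumption: needed to declare fseeko() */

#include <stdbool.h>
#include <stdio.h>
#include <sys/types.h>

/*
 * Probe whether a freshly opened stream supports seeking.
 * Call it before any read or write: it leaves the stream at offset 0,
 * which is only harmless if the stream was already there.
 */
bool stream_can_seek(FILE *fp)
{
    return fseeko(fp, (off_t) 0, SEEK_SET) == 0;
}
```

Note that fseeko() returns 0 on success, so its result has to be compared
against 0 rather than stored directly as a boolean.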




Philip Warner| __---_
Albatross Consulting Pty. Ltd.   |/   -  \
(A.B.N. 75 008 659 498)  |  /(@)   __---_
Tel: (+61) 0500 83 82 81 | _  \
Fax: (+61) 0500 83 82 82 | ___ |
Http://www.rhyme.com.au  |/   \|
 |----
PGP key available upon request,  |  /
and from pgp5.ai.mit.edu:11371   |/




Re: [HACKERS] pg_dump and large files - is this a problem?

2002-10-24 Thread Philip Warner
At 12:07 AM 25/10/2002 -0400, Bruce Momjian wrote:

> I don't think we can assume that off_t can be passed to fset/getpos
> unless we know the platform supports it, unless people think fpos_t
> being integral and the same size as off_t is enough.

We don't need to. We would #define FILE_OFFSET as fpos_t in that case.

> Also, I don't think these can be done as a macro, perhaps
> fseeko(...,SEEK_SET), but not the others, and not ftello.  See
> port/fseeko.c for the reason.


My understanding was that you could define a block and declare variables in 
a macro; just use a local for the temp storage of the in/out args. If this 
is the only thing stopping you adopting this approach, then I am very happy 
to try to code the macros properly.

However, I do get the impression that there is more resistance to the idea 
than just this.






Re: [HACKERS] pg_dump and large files - is this a problem?

2002-10-24 Thread Bruce Momjian
Philip Warner wrote:
> Rather than having a different patch file for each platform and refusing to 
> code fseek/tell because we can't do SEEK_CUR, why not check for FSEEKO64 
> and revert to a simple solution:
> 
> #ifdef HAVE_FSEEKO64
> #define FSEEK fseeko64
> #define FTELL ftello64
> #define FILE_OFFSET off64_t

We can do this, but there is the problem of making the code pretty ugly.
Also, it is not immediately clear when off_t is something to be used by
fseek and when it is being used in file offsets that will never be
seeked.  I am concerned about perhaps making things worse than they are
now.

> #else
> #ifdef HAVE_FSEEKO
> #define FSEEK fseeko
> #define FTELL ftello
> #define FILE_OFFSET off_t
> #else
> #if HAVE_FSEEK_BETTER_THAN_32_BIT
> #define FSEEK FSEEK_BETTER_THAN_32_BIT
> #define FTELL FTELL_BETTER_THAN_32_BIT
> #define FILE_OFFSET FILE_OFFSET_BETTER_THAN_32_BIT
> #else
> #if sizeof(off_t) > sizeof(long)

Can't do sizeof() tests in cpp, which is where the #if is processed.

> #define IGNORE_FSEEK
> #else
> #define FSEEK fseek
> #define FTELL ftell
> #define FILE_OFFSET long
> #end if...
> 
> Then use a correct checkSeek which also checks IGNORE_FSEEK.
> 
> AFAICT, this *will* do the job on all systems discussed. And we can 
> certainly skip the HAVE_FSEEK_BETTER_THAN_32_BIT bit, but coding a trivial 
> seek/tell pair for fsetpos/fgetpos is easy, even in a macro.

I don't think we can assume that off_t can be passed to fset/getpos
unless we know the platform supports it, unless people think fpos_t
being integral and the same size as off_t is enough.

Also, I don't think these can be done as a macro, perhaps
fseeko(...,SEEK_SET), but not the others, and not ftello.  See
port/fseeko.c for the reason.
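
To illustrate the sizeof point: the preprocessor evaluates #if before types
exist, so "#if sizeof(off_t) > sizeof(long)" cannot work, but the same
comparison is an ordinary constant expression in C proper, which is how the
checkSeek patch in this thread tests it. A minimal sketch with hypothetical
names:

```c
#include <stdbool.h>
#include <sys/types.h>

/* Integer constant expression: evaluated by the compiler, not by cpp. */
enum { OFF_T_WIDER_THAN_LONG = (sizeof(off_t) > sizeof(long)) };

/* True when off_t occupies more bytes than long on this platform. */
bool off_t_wider_than_long(void)
{
    return OFF_T_WIDER_THAN_LONG != 0;
}
```

A test at the #if level would need a macro supplied from outside the compiler,
for example by configure; that is an assumption and not something this thread
settles.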




Re: [HACKERS] pg_dump and large files - is this a problem?

2002-10-24 Thread Bruce Momjian
Philip Warner wrote:
> 
> I just reread the patch; is it valid to assume fseek and fseeko have the 
> same  failure modes? Or does the call to 'fseek' actually call fseeko?

The fseek was a typo.  It should have been fseeko as you suggested.
CVS updated.

Your idea of using SEEK_SET is good, except I was concerned that the
checkSeek call will move the file pointer.  Is that OK?  It doesn't seem
appropriate.




Re: [HACKERS] pg_dump and large files - is this a problem?

2002-10-24 Thread Bruce Momjian

OK, finally figured it out. I had used fseek instead of fseeko.

---

Philip Warner wrote:
> At 09:56 PM 24/10/2002 -0400, Bruce Momjian wrote:
> > > > > You are quite correct. It should read:
> > > > >
> > > > >  #ifdef HAVE_FSEEKO
> > > > >   ctx->hasSeek = fseeko(...,SEEK_SET);
> 
>  ^^
> 
> 
> > > > >  #else
> > > > >   ctx->hasSeek = FALSE;
> > > > >  #endif
> 
> 



Re: [HACKERS] pg_dump and large files - is this a problem?

2002-10-24 Thread Philip Warner

I just reread the patch; is it valid to assume fseek and fseeko have the 
same  failure modes? Or does the call to 'fseek' actually call fseeko?





Re: [HACKERS] pg_dump and large files - is this a problem?

2002-10-24 Thread Philip Warner
At 09:38 PM 24/10/2002 -0400, Bruce Momjian wrote:

> OK, I am focusing on AIX now.  I don't think we can go down the road of
> saying where large file support is needed or not needed.  I think for
> each platform either we support large files or we don't.

Rather than having a different patch file for each platform and refusing to 
code fseek/tell because we can't do SEEK_CUR, why not check for FSEEKO64 
and revert to a simple solution:

#ifdef HAVE_FSEEKO64
#define FSEEK fseeko64
#define FTELL ftello64
#define FILE_OFFSET off64_t
#else
#ifdef HAVE_FSEEKO
#define FSEEK fseeko
#define FTELL ftello
#define FILE_OFFSET off_t
#else
#if HAVE_FSEEK_BETTER_THAN_32_BIT
#define FSEEK FSEEK_BETTER_THAN_32_BIT
#define FTELL FTELL_BETTER_THAN_32_BIT
#define FILE_OFFSET FILE_OFFSET_BETTER_THAN_32_BIT
#else
#if sizeof(off_t) > sizeof(long)
#define IGNORE_FSEEK
#else
#define FSEEK fseek
#define FTELL ftell
#define FILE_OFFSET long
#end if...

Then use a correct checkSeek which also checks IGNORE_FSEEK.

AFAICT, this *will* do the job on all systems discussed. And we can 
certainly skip the HAVE_FSEEK_BETTER_THAN_32_BIT bit, but coding a trivial 
seek/tell pair for fsetpos/fgetpos is easy, even in a macro.
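
A compilable sketch of the selection chain above, under stated assumptions: the
HAVE_* macros would normally come from configure and HAVE_FSEEKO is set by hand
here only so the example builds; the sizeof() branch and the
"better than 32 bit" branch are left out; a typedef stands in for the
FILE_OFFSET macro; seek_and_tell is a hypothetical helper.

```c
#define _POSIX_C_SOURCE 200809L   /* assumption: exposes fseeko()/ftello() */
#define _FILE_OFFSET_BITS 64      /* assumption: glibc-style large-file switch */

#include <stdio.h>
#include <sys/types.h>

#define HAVE_FSEEKO 1             /* assumption: normally defined by configure */

#if defined(HAVE_FSEEKO64)
#define FSEEK fseeko64
#define FTELL ftello64
typedef off64_t file_offset;
#elif defined(HAVE_FSEEKO)
#define FSEEK fseeko
#define FTELL ftello
typedef off_t file_offset;
#else
#define FSEEK fseek
#define FTELL ftell
typedef long file_offset;
#endif

/* Seek to an absolute position and report where the stream ended up. */
file_offset seek_and_tell(FILE *fp, file_offset pos)
{
    if (FSEEK(fp, pos, SEEK_SET) != 0)
        return (file_offset) -1;
    return FTELL(fp);
}
```

Using #elif keeps the chain flat, so a single #endif closes it instead of one
per nested #else.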






Re: [HACKERS] pg_dump and large files - is this a problem?

2002-10-24 Thread Philip Warner
At 09:56 PM 24/10/2002 -0400, Bruce Momjian wrote:

> > > You are quite correct. It should read:
> > >
> > >  #ifdef HAVE_FSEEKO
> > >   ctx->hasSeek = fseeko(...,SEEK_SET);


^^



> > >  #else
> > >   ctx->hasSeek = FALSE;
> > >  #endif






Re: [HACKERS] pg_dump and large files - is this a problem?

2002-10-24 Thread Bruce Momjian

You are going to have to be more specific than that.

---

Philip Warner wrote:
> 
> The patch will not work. Please reread my quoted email.
> 
> At 09:32 PM 24/10/2002 -0400, Bruce Momjian wrote:
> >Philip Warner wrote:
> > >
> > > You are quite correct. It should read:
> > >
> > >  #ifdef HAVE_FSEEKO
> > >   ctx->hasSeek = fseeko(...,SEEK_SET);
> > >  #else
> > >   ctx->hasSeek = FALSE;
> > >  #endif
> > >
> > > pipes are the main case for which we are checking.
> >
> >OK, I have applied the following patch to set hasSeek only if
> >fseek/fseeko is reliable.
> 
> 
> 
> 



Re: [HACKERS] pg_dump and large files - is this a problem?

2002-10-24 Thread Philip Warner

The patch will not work. Please reread my quoted email.

At 09:32 PM 24/10/2002 -0400, Bruce Momjian wrote:
> Philip Warner wrote:
> >
> > You are quite correct. It should read:
> >
> >  #ifdef HAVE_FSEEKO
> >   ctx->hasSeek = fseeko(...,SEEK_SET);
> >  #else
> >   ctx->hasSeek = FALSE;
> >  #endif
> >
> > pipes are the main case for which we are checking.
>
> OK, I have applied the following patch to set hasSeek only if
> fseek/fseeko is reliable.








Re: [HACKERS] pg_dump and large files - is this a problem?

2002-10-24 Thread Bruce Momjian
Zeugswetter Andreas SB SD wrote:
> 
> > The question is *which* seek APIs we need to support.  Are there any
> > besides fseeko() and fgetpos()?
> 
> On AIX we have 
> int fseeko64 (FILE* Stream, off64_t Offset, int Whence);
> which is intended for large file access for programs that do NOT
> #define _LARGE_FILES
> 
> It is functionality that is available if _LARGE_FILE_API is defined,
> which is the default if _LARGE_FILES is not defined.
> 
> That would have been my preferred way of handling large files on AIX
> in the two/three? places that need it (pg_dump/restore, psql and backend COPY).
> This would have had the advantage that off_t is not 64 bit in all other places
> where it is actually not needed, no ?

OK, I am focusing on AIX now.  I don't think we can go down the road of
saying where large file support is needed or not needed.  I think for
each platform either we support large files or we don't.  Is there a way
to have off_t be 64 bits everywhere, and if it is, why wouldn't we just
enable that rather than poke around figuring out where it is needed?

Also, I have the open item:

Fix AIX + Large File + Flex problem

Is there an AIX problem with Flex?




Re: [HACKERS] pg_dump and large files - is this a problem?

2002-10-24 Thread Bruce Momjian
Philip Warner wrote:
> At 10:08 PM 23/10/2002 -0400, Bruce Momjian wrote:
> >Well, that certainly changes the functionality of the code.  I thought
> >that fseeko test was done so that things that couldn't be seeked on were
> >detected.
> 
> You are quite correct. It should read:
> 
>  #ifdef HAVE_FSEEKO
>   ctx->hasSeek = fseeko(...,SEEK_SET);
>  #else
>   ctx->hasSeek = FALSE;
>  #endif
> 
> pipes are the main case for which we are checking.

OK, I have applied the following patch to set hasSeek only if
fseek/fseeko is reliable.  This takes care of the random failure case
for large files.  Now I need to see if I can get the custom fseeko
working for more platforms.


Index: src/bin/pg_dump/common.c
===
RCS file: /cvsroot/pgsql-server/src/bin/pg_dump/common.c,v
retrieving revision 1.71
diff -c -c -r1.71 common.c
*** src/bin/pg_dump/common.c9 Oct 2002 16:20:25 -   1.71
--- src/bin/pg_dump/common.c25 Oct 2002 01:30:51 -
***
*** 290,296 
 * attr with the same name, then only dump it if:
 *
 * - it is NOT NULL and zero parents are NOT NULL
!*   OR 
 * - it has a default value AND the default value does not match
 *   all parent default values, or no parents specify a default.
 *
--- 290,296 
 * attr with the same name, then only dump it if:
 *
 * - it is NOT NULL and zero parents are NOT NULL
!*   OR
 * - it has a default value AND the default value does not match
 *   all parent default values, or no parents specify a default.
 *
Index: src/bin/pg_dump/pg_backup_archiver.c
===
RCS file: /cvsroot/pgsql-server/src/bin/pg_dump/pg_backup_archiver.c,v
retrieving revision 1.59
diff -c -c -r1.59 pg_backup_archiver.c
*** src/bin/pg_dump/pg_backup_archiver.c22 Oct 2002 19:15:23 -  1.59
--- src/bin/pg_dump/pg_backup_archiver.c25 Oct 2002 01:30:57 -
***
*** 2338,2343 
--- 2338,2369 
  }
  
  
+ /*
+  * checkSeek
+  *  check to see if fseek can be performed.
+  */
+ 
+ bool
+ checkSeek(FILE *fp)
+ {
+ 
+   if (fseek(fp, 0, SEEK_CUR) != 0)
+   return false;
+   else if (sizeof(off_t) > sizeof(long))
+   /*
+*  At this point, off_t is too large for long, so we return
+*  based on whether an off_t version of fseek is available.
+*/
+ #ifdef HAVE_FSEEKO
+   return true;
+ #else
+   return false;
+ #endif
+   else
+   return true;
+ }
+ 
+ 
  static void
  _SortToc(ArchiveHandle *AH, TocSortCompareFn fn)
  {
Index: src/bin/pg_dump/pg_backup_archiver.h
===
RCS file: /cvsroot/pgsql-server/src/bin/pg_dump/pg_backup_archiver.h,v
retrieving revision 1.48
diff -c -c -r1.48 pg_backup_archiver.h
*** src/bin/pg_dump/pg_backup_archiver.h22 Oct 2002 19:15:23 -  1.48
--- src/bin/pg_dump/pg_backup_archiver.h25 Oct 2002 01:30:58 -
***
*** 27,32 
--- 27,33 
  
  #include "postgres_fe.h"
  
+ #include 
  #include 
  #include 
  
***
*** 284,289 
--- 285,291 
  extern void WriteDataChunks(ArchiveHandle *AH);
  
  extern intTocIDRequired(ArchiveHandle *AH, int id, RestoreOptions *ropt);
+ extern bool checkSeek(FILE *fp);
  
  /*
   * Mandatory routines for each supported format
Index: src/bin/pg_dump/pg_backup_custom.c
===
RCS file: /cvsroot/pgsql-server/src/bin/pg_dump/pg_backup_custom.c,v
retrieving revision 1.22
diff -c -c -r1.22 pg_backup_custom.c
*** src/bin/pg_dump/pg_backup_custom.c  22 Oct 2002 19:15:23 -  1.22
--- src/bin/pg_dump/pg_backup_custom.c  25 Oct 2002 01:31:01 -
***
*** 179,185 
if (!AH->FH)
die_horribly(AH, modulename, "could not open archive file %s: 
%s\n", AH->fSpec, strerror(errno));
  
!   ctx->hasSeek = (fseeko(AH->FH, 0, SEEK_CUR) == 0);
}
else
{
--- 179,185 
if (!AH->FH)
die_horribly(AH, modulename, "could not open archive file %s: 
%s\n", AH->fSpec, strerror(errno));
  
!   ctx->hasSeek = checkSeek(AH->FH);
}
else
{
***
*** 190,196 
if (!AH->FH)
die_horr

Re: [HACKERS] pg_dump and large files - is this a problem?

2002-10-24 Thread Bruce Momjian
Peter Eisentraut wrote:
> Bruce Momjian writes:
> 
> > OK, NetBSD added.
> >
> > Any other OS's need this?  Is it safe for me to code something that
> > assumes fpos_t and off_t are identical?  I can't think of a good way to
> > test if two data types are identical.  I don't think sizeof is enough.
> 
> No, you can't assume that fpos_t and off_t are identical.

I was wondering --- if fpos_t and off_t have identical sizeof, and fpos_t
can do shift << or >>, that means fpos_t is also integral like off_t.
Can I then assume they are the same?

> But you can simulate a long fseeko() by calling fseek() multiple times, so
> it should be possible to write a replacement that works on all systems.

Yes, but I can't simulate ftello, so I then can't do SEEK_CUR, and if I
can't duplicate the entire API, I don't want to try.




Re: [HACKERS] pg_dump and large files - is this a problem?

2002-10-24 Thread Peter Eisentraut
Bruce Momjian writes:

> OK, NetBSD added.
>
> Any other OS's need this?  Is it safe for me to code something that
> assumes fpos_t and off_t are identical?  I can't think of a good way to
> test if two data types are identical.  I don't think sizeof is enough.

No, you can't assume that fpos_t and off_t are identical.

But you can simulate a long fseeko() by calling fseek() multiple times, so
it should be possible to write a replacement that works on all systems.
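
A minimal sketch of the idea, assuming only an absolute (SEEK_SET style) seek
is needed: rewind, then advance in steps small enough to fit in a long. The
function name and the chunk parameter are hypothetical; a real replacement
would presumably step by LONG_MAX. Whether fseek() may move past LONG_MAX at
all when long is 32 bits is platform-dependent, and this gives no equivalent
of ftello().

```c
#include <stdio.h>

/*
 * Position fp at an absolute offset using only fseek(), moving at most
 * `chunk` bytes per call.  Returns 0 on success, -1 on failure.
 */
int seek_set_chunked(FILE *fp, long long offset, long chunk)
{
    if (offset < 0 || chunk <= 0)
        return -1;
    if (fseek(fp, 0L, SEEK_SET) != 0)
        return -1;
    while (offset > 0)
    {
        long step = (offset > chunk) ? chunk : (long) offset;

        if (fseek(fp, step, SEEK_CUR) != 0)
            return -1;
        offset -= step;
    }
    return 0;
}
```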




Re: [HACKERS] pg_dump and large files - is this a problem?

2002-10-24 Thread Zeugswetter Andreas SB SD

> The question is *which* seek APIs we need to support.  Are there any
> besides fseeko() and fgetpos()?

On AIX we have 
int fseeko64 (FILE* Stream, off64_t Offset, int Whence);
which is intended for large file access for programs that do NOT
#define _LARGE_FILES

It is functionality that is available if _LARGE_FILE_API is defined,
which is the default if _LARGE_FILES is not defined.

That would have been my preferred way of handling large files on AIX
in the two/three? places that need it (pg_dump/restore, psql and backend COPY).
This would have had the advantage that off_t is not 64 bit in all other places
where it is actually not needed, no ?

Andreas




Re: [HACKERS] pg_dump and large files - is this a problem?

2002-10-23 Thread Bruce Momjian

OK, NetBSD added.  

Any other OS's need this?  Is it safe for me to code something that
assumes fpos_t and off_t are identical?  I can't think of a good way to
test if two data types are identical.  I don't think sizeof is enough.

---

Giles Lean wrote:
> 
> > OK, does pre-1.6 NetBSD have fgetpos/fsetpos that is off_t/quad?
> 
> Yes:
> 
> int
> fgetpos(FILE *stream, fpos_t *pos);
> 
> int
> fsetpos(FILE *stream, const fpos_t *pos);
> 
> Per comments in <stdio.h> fpos_t is the same format as off_t, and
> off_t and fpos_t have been 64 bit since 1994.
> 
> http://cvsweb.netbsd.org/bsdweb.cgi/basesrc/include/stdio.h
> 
> Regards,
> 
> Giles
> 
> 
> 
> 
> 




Re: [HACKERS] pg_dump and large files - is this a problem?

2002-10-23 Thread Bruce Momjian

Looks like I have some more work to do.  Thanks.

---

Giles Lean wrote:
> 
> > OK, does pre-1.6 NetBSD have fgetpos/fsetpos that is off_t/quad?
> 
> Yes:
> 
> int
> fgetpos(FILE *stream, fpos_t *pos);
> 
> int
> fsetpos(FILE *stream, const fpos_t *pos);
> 
> Per comments in <stdio.h>, fpos_t is the same format as off_t, and
> off_t and fpos_t have been 64 bit since 1994.
> 
> http://cvsweb.netbsd.org/bsdweb.cgi/basesrc/include/stdio.h
> 
> Regards,
> 
> Giles
> 
> 
> 
> 
> 



Re: [HACKERS] pg_dump and large files - is this a problem?

2002-10-23 Thread Giles Lean

> OK, does pre-1.6 NetBSD have fgetpos/fsetpos that is off_t/quad?

Yes:

int
fgetpos(FILE *stream, fpos_t *pos);

int
fsetpos(FILE *stream, const fpos_t *pos);

Per comments in <stdio.h>, fpos_t is the same format as off_t, and
off_t and fpos_t have been 64 bit since 1994.

http://cvsweb.netbsd.org/bsdweb.cgi/basesrc/include/stdio.h

Regards,

Giles





---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to [EMAIL PROTECTED] so that your
message can get through to the mailing list cleanly



Re: [HACKERS] pg_dump and large files - is this a problem?

2002-10-23 Thread Bruce Momjian
Philip Warner wrote:
> At 10:03 PM 23/10/2002 -0400, Bruce Momjian wrote:
> >It is much cleaner to just duplicate the entire API so you don't have
> >any limitations or failure cases.
> 
> We may still end up using macros in pg_dump to cope with cases where off_t 
> & fseeko are not defined - if there are any. I presume we would then just 
> revert to calling fseek/ftell etc.

Well, we have fseeko falling back to fseek already, so that is working
fine.  I don't think we will find any OS's without off_t.  We just need
a little smarts.  Let me see if I can work on it now.




Re: [HACKERS] pg_dump and large files - is this a problem?

2002-10-23 Thread Philip Warner
At 10:03 PM 23/10/2002 -0400, Bruce Momjian wrote:

It is much cleaner to just duplicate the entire API so you don't have
any limitations or failure cases.


We may still end up using macros in pg_dump to cope with cases where off_t 
& fseeko are not defined - if there are any. I presume we would then just 
revert to calling fseek/ftell etc.





Philip Warner| __---_
Albatross Consulting Pty. Ltd.   |/   -  \
(A.B.N. 75 008 659 498)  |  /(@)   __---_
Tel: (+61) 0500 83 82 81 | _  \
Fax: (+61) 0500 83 82 82 | ___ |
Http://www.rhyme.com.au  |/   \|
 |----
PGP key available upon request,  |  /
and from pgp5.ai.mit.edu:11371   |/




Re: [HACKERS] pg_dump and large files - is this a problem?

2002-10-23 Thread Philip Warner
At 10:08 PM 23/10/2002 -0400, Bruce Momjian wrote:

Well, that certainly changes the functionality of the code.  I thought
that fseeko test was done so that things that couldn't be seeked on were
detected.


You are quite correct. It should read:

#ifdef HAVE_FSEEKO
 ctx->hasSeek = (fseeko(..., SEEK_SET) == 0);
#else
 ctx->hasSeek = FALSE;
#endif

pipes are the main case for which we are checking.






Re: [HACKERS] pg_dump and large files - is this a problem?

2002-10-23 Thread Bruce Momjian

Well, that certainly changes the functionality of the code.  I thought
that fseeko test was done so that things that couldn't be seeked on were
detected.  Not sure what isn't seek-able, maybe named pipes.  I thought
it was testing that so I didn't touch that variable.

This was my original thought, that we have non-fseeko code in place. 
Can we just trigger the non-fseeko code on HAS_FSEEKO?  The code would
be something like:

if (sizeof(long) >= sizeof(off_t))
    ctx->hasSeek = TRUE;
else
#ifdef HAVE_FSEEKO
    ctx->hasSeek = TRUE;
#else
    ctx->hasSeek = FALSE;
#endif

---

Philip Warner wrote:
> At 11:55 AM 24/10/2002 +1000, Philip Warner wrote:
> 
> >The only code that uses SEEK_CUR is the code to check if seek is available 
> >- I am very happy to change that to SEEK_SET - I can't even recall why I 
> >used SEEK_CUR. The code that does the real seeks uses SEEK_SET.
> 
> Come to think of it:
> 
>  ctx->hasSeek = (fseeko(AH->FH, 0, SEEK_CUR) == 0);
> 
> should be replaced by:
> 
> #ifdef HAS_FSEEK[O]
>  ctx->hasSeek = TRUE;
> #else
>  ctx->hasSeek = FALSE;
> #endif
> 
> Since we're now checking for it in configure, we should remove the checks 
> from the pg_dump code.
> 
> 
> 
> 
> 

---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster



Re: [HACKERS] pg_dump and large files - is this a problem?

2002-10-23 Thread Bruce Momjian
Philip Warner wrote:
> At 09:45 PM 23/10/2002 -0400, Bruce Momjian wrote:
> >We have to write another function because fsetpos doesn't do SEEK_CUR so
> >you have to implement it with more complex code.  It isn't a drop in
> >place thing.
> 
> The only code that uses SEEK_CUR is the code to check if seek is available 
> - I am very happy to change that to SEEK_SET - I can't even recall why I 
> used SEEK_CUR. The code that does the real seeks uses SEEK_SET.

There are other problems.  fgetpos() expects a pointer to an fpos_t,
while ftello just returns off_t, so you need a local variable in the
function to pass to fgetpos() and then return that from the function.

It is much cleaner to just duplicate the entire API so you don't have
any limitations or failure cases.
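To illustrate the extra work: an fsetpos()-style call only accepts an absolute position, so a replacement has to translate SEEK_CUR (and SEEK_END) first. A rough sketch of that translation step, with long long standing in for off_t so it is platform-independent. resolve_seek_target is a hypothetical name, and this is not the code in src/port/fseeko.c.

```c
#include <stdio.h>

/*
 * Translate an fseek-style (offset, whence) pair into the absolute
 * position an fsetpos-style call needs.  'current' is what ftello()
 * would report and 'end' is the file size.  Returns 0 and stores the
 * result in *target on success, -1 for an unknown whence or a negative
 * result.  Hypothetical helper; long long stands in for off_t.
 */
static int
resolve_seek_target(long long offset, int whence,
					long long current, long long end,
					long long *target)
{
	long long	base;

	switch (whence)
	{
		case SEEK_SET:
			base = 0;
			break;
		case SEEK_CUR:
			base = current;
			break;
		case SEEK_END:
			base = end;
			break;
		default:
			return -1;
	}
	if (base + offset < 0)
		return -1;
	*target = base + offset;
	return 0;
}
```

A real wrapper would obtain 'current' through fgetpos() into a local variable and then hand the computed target to fsetpos().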




Re: [HACKERS] pg_dump and large files - is this a problem?

2002-10-23 Thread Philip Warner
At 11:55 AM 24/10/2002 +1000, Philip Warner wrote:


The only code that uses SEEK_CUR is the code to check if seek is available 
- I am very happy to change that to SEEK_SET - I can't even recall why I 
used SEEK_CUR. The code that does the real seeks uses SEEK_SET.

Come to think of it:

ctx->hasSeek = (fseeko(AH->FH, 0, SEEK_CUR) == 0);

should be replaced by:

#ifdef HAS_FSEEK[O]
ctx->hasSeek = TRUE;
#else
ctx->hasSeek = FALSE;
#endif

Since we're now checking for it in configure, we should remove the checks 
from the pg_dump code.







Re: [HACKERS] pg_dump and large files - is this a problem?

2002-10-23 Thread Philip Warner
At 09:45 PM 23/10/2002 -0400, Bruce Momjian wrote:

We have to write another function because fsetpos doesn't do SEEK_CUR so
you have to implement it with more complex code.  It isn't a drop in
place thing.


The only code that uses SEEK_CUR is the code to check if seek is available 
- I am very happy to change that to SEEK_SET - I can't even recall why I 
used SEEK_CUR. The code that does the real seeks uses SEEK_SET.






Re: [HACKERS] pg_dump and large files - is this a problem?

2002-10-23 Thread Bruce Momjian
Philip Warner wrote:
> At 09:41 PM 23/10/2002 -0400, Bruce Momjian wrote:
> >If we get this, everything is fine.  I have done that for BSD/OS today.
> >I may need to do the same for NetBSD/OpenBSD too.
> 
> What did you do to achieve this?

See src/port/fseeko.c in current CVS, with some configure.in glue.




Re: [HACKERS] pg_dump and large files - is this a problem?

2002-10-23 Thread Philip Warner
At 09:41 PM 23/10/2002 -0400, Bruce Momjian wrote:

If we get this, everything is fine.  I have done that for BSD/OS today.
I may need to do the same for NetBSD/OpenBSD too.


What did you do to achieve this?





---(end of broadcast)---
TIP 6: Have you searched our list archives?

http://archives.postgresql.org



Re: [HACKERS] pg_dump and large files - is this a problem?

2002-10-23 Thread Philip Warner
At 09:36 PM 23/10/2002 -0400, Bruce Momjian wrote:

We are going to need to either get fseeko workarounds for
those, or disable those features in a meaningful way.


? if we have not got a 64 bit seek function of any kind, then use a 32 
bit seek - the features don't need to be disabled. AFAICT, this is a 
non-issue: no 64 bit seek means no large files.

I'm not sure we should even worry about it, but if you are genuinely 
concerned that we have no 64 bit seek call, but we do have files > 4GB, 
then if you really want to disable seek, just modify the code that sets 
'hasSeek' - don't screw around with every seek call. But only clear 
it if the file is > 4GB.









Re: [HACKERS] pg_dump and large files - is this a problem?

2002-10-23 Thread Bruce Momjian
Philip Warner wrote:
> At 11:50 PM 23/10/2002 +0200, Peter Eisentraut wrote:
> 
> >1. Disable access to large files.
> >
> >2. Seek in some other way.
> 
> This gets my vote, but I would like to see a clean implementation (not huge 
> quantities of ifdefs every time we call fseek); either we write our own 
> fseek as Bruce seems to be suggesting, or we have a single header file that 
> defines the FSEEK/FTELL/OFF_T to point to the 'right' functions, where 
> 'right' is defined as 'most likely to generate an integer and which makes 
> use of the largest number of bytes'.

We have to write another function because fsetpos doesn't do SEEK_CUR so
you have to implement it with more complex code.  It isn't a drop in
place thing.

> The way the code is currently written it does not matter if this is a 16 or 
> 3 byte value - so long as it is an integer.

Right. What we are assuming now is that off_t can be seeked using
whatever we defined for fseeko, which is incorrect on one OS, and now I
hear on more than one.




Re: [HACKERS] pg_dump and large files - is this a problem?

2002-10-23 Thread Philip Warner
At 11:50 PM 23/10/2002 +0200, Peter Eisentraut wrote:


1. Disable access to large files.

2. Seek in some other way.


This gets my vote, but I would like to see a clean implementation (not huge 
quantities of ifdefs every time we call fseek); either we write our own 
fseek as Bruce seems to be suggesting, or we have a single header file that 
defines the FSEEK/FTELL/OFF_T to point to the 'right' functions, where 
'right' is defined as 'most likely to generate an integer and which makes 
use of the largest number of bytes'.

The way the code is currently written it does not matter if this is a 16 or 
3 byte value - so long as it is an integer.
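A minimal sketch of the single-header idea, so call sites carry no ifdefs. The PG_FSEEK, PG_FTELL, pg_file_offset and skip_to names are hypothetical, and HAVE_FSEEKO is assumed to be set by configure.

```c
#include <stdio.h>
#include <sys/types.h>

/*
 * One place that picks the seek/tell pair and the matching offset type.
 * HAVE_FSEEKO is assumed to come from configure; the PG_* names are
 * hypothetical.
 */
#ifdef HAVE_FSEEKO
typedef off_t pg_file_offset;
#define PG_FSEEK(fp, off, whence)	fseeko((fp), (off), (whence))
#define PG_FTELL(fp)				ftello(fp)
#else
typedef long pg_file_offset;
#define PG_FSEEK(fp, off, whence)	fseek((fp), (off), (whence))
#define PG_FTELL(fp)				ftell(fp)
#endif

/* Callers use the macros only; there is no #ifdef at the call site. */
static int
skip_to(FILE *fp, pg_file_offset pos)
{
	return PG_FSEEK(fp, pos, SEEK_SET);
}
```

The rest of the code then treats pg_file_offset as an integer of whatever width the platform offers.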






Re: [HACKERS] pg_dump and large files - is this a problem?

2002-10-23 Thread Bruce Momjian
Philip Warner wrote:
> At 10:42 AM 23/10/2002 -0400, Bruce Momjian wrote:
> >What I am concerned about are cases that fail at runtime, specifically
> >during a restore of a >2gig file.
> 
> Please give an example that would still apply assuming we get a working 
> seek/tell pair that works with whatever we use as an offset?

If we get this, everything is fine.  I have done that for BSD/OS today. 
I may need to do the same for NetBSD/OpenBSD too.

> If you are concerned about reading a dump file with 8 byte offsets on a 
> machine with 4 byte off_t, that case and its permutations are already covered.

No, I know that is covered because it will report a proper error message
on the restore on the 4-byte off_t machine.
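A rough sketch of how such a check can work when an archive stores offsets byte by byte. The byte layout (least significant byte first) and the decode_offset name are assumptions for illustration, not a description of the actual archive format.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Decode an offset stored as 'stored_size' bytes, least significant
 * byte first, into a value that must fit in 'native_size' bytes.
 * Returns 0 on success, or -1 if any byte beyond native_size is
 * non-zero, meaning the offset cannot be represented on this machine.
 * Hypothetical helper; the byte layout is an assumption.
 */
static int
decode_offset(const unsigned char *buf, size_t stored_size,
			  size_t native_size, uint64_t *out)
{
	uint64_t	value = 0;
	size_t		i;

	for (i = 0; i < stored_size; i++)
	{
		if (i < native_size && i < sizeof(uint64_t))
			value |= ((uint64_t) buf[i]) << (i * 8);
		else if (buf[i] != 0)
			return -1;		/* too large for the native offset type */
	}
	*out = value;
	return 0;
}
```

With an 8-byte stored offset and a 4-byte native off_t, small offsets decode normally and anything needing the upper bytes is rejected with an error instead of being silently truncated.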




Re: [HACKERS] pg_dump and large files - is this a problem?

2002-10-23 Thread Philip Warner
At 10:42 AM 23/10/2002 -0400, Bruce Momjian wrote:

What I am concerned about are cases that fail at runtime, specifically
during a restore of a >2gig file.


Please give an example that would still apply assuming we get a working 
seek/tell pair that works with whatever we use as an offset?

If you are concerned about reading a dump file with 8 byte offsets on a 
machine with 4 byte off_t, that case and its permutations are already covered.





Re: [HACKERS] pg_dump and large files - is this a problem?

2002-10-23 Thread Bruce Momjian
Giles Lean wrote:
> 
> Bruce Momjian <[EMAIL PROTECTED]> writes:
> 
> > OK, well BSD/OS now works, but I wonder if there are any other quad
> > off_t OS's out there without fseeko.
> 
> NetBSD prior to 1.6, released September 14, 2002. (Source: CVS logs.)

OK, does pre-1.6 NetBSD have fgetpos/fsetpos that is off_t/quad?




Re: [HACKERS] pg_dump and large files - is this a problem?

2002-10-23 Thread Bruce Momjian
Philip Warner wrote:
> At 05:50 PM 23/10/2002 -0400, Bruce Momjian wrote:
> >Looking at the pg_dump code, it seems the fseeks are optional in there
> >anyway because it already has code to read the file sequentially rather
> 
> But there are features that are not available if it can't seek: eg. it will 
> not restore in a different order to that in which it was written; it will 
> not dump data offsets in the TOC so dump files can not be restored in 
> alternate orders; restore times will be large for a single table (it has to 
> read the entire file potentially).

OK, that helps.  We just got a list of 2 other OS's without fseeko and
with large file support.  Any NetBSD before August 2002 has that
problem.  We are going to need to either get fseeko workarounds for
those, or disable those features in a meaningful way.




Re: [HACKERS] pg_dump and large files - is this a problem?

2002-10-23 Thread Philip Warner
At 05:50 PM 23/10/2002 -0400, Bruce Momjian wrote:

Looking at the pg_dump code, it seems the fseeks are optional in there
anyway because it already has code to read the file sequentially rather


But there are features that are not available if it can't seek: eg. it will 
not restore in a different order to that in which it was written; it will 
not dump data offsets in the TOC so dump files can not be restored in 
alternate orders; restore times will be large for a single table (it has to 
read the entire file potentially).





Re: [HACKERS] pg_dump and large files - is this a problem?

2002-10-23 Thread Giles Lean

Bruce Momjian <[EMAIL PROTECTED]> writes:

> OK, well BSD/OS now works, but I wonder if there are any other quad
> off_t OS's out there without fseeko.

NetBSD prior to 1.6, released September 14, 2002. (Source: CVS logs.)

OpenBSD prior to 2.7, released June 15, 2000.  (Source: release notes.)

FreeBSD has had fseeko() for some time, but I'm not sure which release
introduced it -- perhaps 3.2.0, released May, 1999. (Source: CVS logs.)

Regards,

Giles








Re: [HACKERS] pg_dump and large files - is this a problem?

2002-10-23 Thread Bruce Momjian
Tom Lane wrote:
> Peter Eisentraut <[EMAIL PROTECTED]> writes:
> > First we need to decide what we want to happen and after that think about
> > how to implement it.  Given sizeof(off_t) > sizeof(long) and no fseeko(),
> > we have the following options:
> 
> It seems obvious to me that there are no platforms that offer
> sizeof(off_t) > sizeof(long) but have no API for doing seeks with off_t.
> That would be just plain silly.  IMHO it's acceptable for us to fail at
> configure time if we can't figure out how to seek.

I would certainly be happy failing at configure time, so we know at the
start what is broken, rather than failures during restore.

> The question is *which* seek APIs we need to support.  Are there any
> besides fseeko() and fgetpos()?

What I have added is BSD/OS specific because only on BSD/OS do I know
fpos_t and off_t are the same type.  If we come up with other platforms,
we will have to deal with it then.




Re: [HACKERS] pg_dump and large files - is this a problem?

2002-10-23 Thread Tom Lane
Bruce Momjian <[EMAIL PROTECTED]> writes:
> How would we disable access to large files?

I think configure should fail if it can't find a way to seek.
Workaround for anyone in that situation is configure --disable-largefile.

regards, tom lane




Re: [HACKERS] pg_dump and large files - is this a problem?

2002-10-23 Thread Tom Lane
Peter Eisentraut <[EMAIL PROTECTED]> writes:
> First we need to decide what we want to happen and after that think about
> how to implement it.  Given sizeof(off_t) > sizeof(long) and no fseeko(),
> we have the following options:

It seems obvious to me that there are no platforms that offer
sizeof(off_t) > sizeof(long) but have no API for doing seeks with off_t.
That would be just plain silly.  IMHO it's acceptable for us to fail at
configure time if we can't figure out how to seek.

The question is *which* seek APIs we need to support.  Are there any
besides fseeko() and fgetpos()?
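The decision being discussed can be written down as a small table. The sketch below expresses it in C purely for illustration; in practice the test would live in configure, and the enum, the function name and its parameters are all hypothetical.

```c
#include <stddef.h>

typedef enum
{
	SEEK_API_FSEEKO,		/* fseeko()/ftello() with off_t */
	SEEK_API_FGETPOS,		/* fgetpos()/fsetpos(), fpos_t same as off_t */
	SEEK_API_FSEEK,			/* fseek()/ftell(); long is wide enough */
	SEEK_API_NONE			/* nothing usable: configure should fail */
} SeekApi;

/*
 * Pick the seek API from facts a configure run would establish.
 * Hypothetical helper; sizes are in bytes, flags are 0 or 1.
 */
static SeekApi
choose_seek_api(size_t off_t_size, size_t long_size,
				int have_fseeko, int fpos_is_off_t)
{
	if (have_fseeko)
		return SEEK_API_FSEEKO;
	if (long_size >= off_t_size)
		return SEEK_API_FSEEK;		/* fseek() covers every offset */
	if (fpos_is_off_t)
		return SEEK_API_FGETPOS;
	return SEEK_API_NONE;
}
```

Only the last case, a wide off_t with no way to seek on it, is the one where failing at configure time (or using --disable-largefile) applies.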

regards, tom lane




Re: [HACKERS] pg_dump and large files - is this a problem?

2002-10-23 Thread Bruce Momjian
Peter Eisentraut wrote:
> Bruce Momjian writes:
> 
> > I think you are right that we have to not use off_t and use long if we
> > can't find a proper 64-bit seek function, but what are the failure modes
> > of doing this?  Exactly what happens for larger files?
> 
> First we need to decide what we want to happen and after that think about
> how to implement it.  Given sizeof(off_t) > sizeof(long) and no fseeko(),
> we have the following options:
> 
> 1. Disable access to large files.
> 
> 2. Seek in some other way.
> 
> What's it gonna be?

OK, well BSD/OS now works, but I wonder if there are any other quad
off_t OS's out there without fseeko.

How would we disable access to large files?  Do we fstat the file and
see if it is too large?   I suppose we are looking for cases where the
file system has large files, but fseeko doesn't allow us to access them.
Should we leave this issue alone and wait to find another OS with this
problem, and we can then rejigger fseeko.c to handle that OS too?
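For the fstat idea, a minimal sketch of what such a size check might look like, assuming POSIX fileno() and fstat() are available. file_fits_in_long is a hypothetical name.

```c
#include <limits.h>
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>

/*
 * Return 1 if every position in the open file can be reached with
 * fseek(), i.e. its size fits in a long; 0 if the file is too large;
 * -1 if fstat() fails.  Hypothetical helper.
 */
static int
file_fits_in_long(FILE *fp)
{
	struct stat st;

	if (fstat(fileno(fp), &st) != 0)
		return -1;
	/* Compare in long long so neither side is truncated. */
	return (long long) st.st_size <= (long long) LONG_MAX;
}
```

A caller could use the result to refuse seeking, or to report a clear error, before any restore work starts.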

Looking at the pg_dump code, it seems the fseeks are optional in there
anyway because it already has code to read the file sequentially rather
than use fseek, and the TOC case in pg_backup_custom.c says that is
optional too.




Re: [HACKERS] pg_dump and large files - is this a problem?

2002-10-23 Thread Peter Eisentraut
Bruce Momjian writes:

> I think you are right that we have to not use off_t and use long if we
> can't find a proper 64-bit seek function, but what are the failure modes
> of doing this?  Exactly what happens for larger files?

First we need to decide what we want to happen and after that think about
how to implement it.  Given sizeof(off_t) > sizeof(long) and no fseeko(),
we have the following options:

1. Disable access to large files.

2. Seek in some other way.

What's it gonna be?

-- 
Peter Eisentraut   [EMAIL PROTECTED]





Re: [HACKERS] pg_dump and large files - is this a problem?

2002-10-23 Thread Bruce Momjian
Philip Warner wrote:
> At 01:02 AM 23/10/2002 -0400, Bruce Momjian wrote:
> 
> >OK, you are saying if we don't have fseeko(), there is no reason to use
> >off_t, and we may as well use long.  What limitations does that impose,
> >and are the limitations clear to the user.
> 
> What I'm saying is that if we have not got fseeko then we should use any 
> 'seek-class' function that returns a 64 bit value. We have already made the 
> assumption that off_t is an integer; the same logic that came to that 
> conclusion, applies just as validly to the other seek functions.
> 
> Secondly, if there is no 64 bit 'seek-class' function, then we should 
> probably use a size_t, but a long would probably be fine too. I am not 
> particularly attached to this part; long, int etc etc. Whatever is most 
> likely to return an integer and work with whatever function we choose.
> 
> As to implications: assuming they are all integers (which as you know I 
> don't like), we should have no problems.
> 
> If a system does not have any function to access 64 bit file offsets, then 
> I'd say they are pretty unlikely to have files > 2GB.

Let me see if I can be clearer.  With shifting off_t, if that fails, we
will find out right away, at compile time.  I think that is acceptable.

What I am concerned about are cases that fail at runtime, specifically
during a restore of a >2gig file.  In my reading of the code, those
failures will be silent or will produce unusual error messages.  I don't
think we can ship code that has strange failure modes for data restore.

Now, if someone knows those failure cases, I would love to hear about
it.  If not, I will dig into the code today and find out where they are.




Re: [HACKERS] pg_dump and large files - is this a problem?

2002-10-23 Thread Bruce Momjian
Philip Warner wrote:
> At 01:02 AM 23/10/2002 -0400, Bruce Momjian wrote:
> 
> >OK, you are saying if we don't have fseeko(), there is no reason to use
> >off_t, and we may as well use long.  What limitations does that impose,
> >and are the limitations clear to the user.
> 
> What I'm saying is that if we have not got fseeko then we should use any 
> 'seek-class' function that returns a 64 bit value. We have already made the 
> assumption that off_t is an integer; the same logic that came to that 
> conclusion, applies just as validly to the other seek functions.

Oh, I see, so try to use fsetpos/fgetpos?  I can write wrappers for
those to look like fseeko/ftello and put them in /port.

> Secondly, if there is no 64 bit 'seek-class' function, then we should 
> probably use a size_t, but a long would probably be fine too. I am not 
> particularly attached to this part; long, int etc etc. Whatever is most 
> likely to return an integer and work with whatever function we choose.
> 
> As to implications: assuming they are all integers (which as you know I 
> don't like), we should have no problems.
> 
> If a system does not have any function to access 64 bit file offsets, then 
> I'd say they are pretty unlikely to have files > 2GB.

OK, my OS can handle 64-bit files, but has only fgetpos/fsetpos, so I
could get that working.  The bigger question is what about OS's that
have 64-bit off_t/files but don't have any seek-type functions.  I did
research to find mine, but what about others that may have other
variants?

I think you are right that we have to not use off_t and use long if we
can't find a proper 64-bit seek function, but what are the failure modes
of doing this?  Exactly what happens for larger files?




Re: [HACKERS] pg_dump and large files - is this a problem?

2002-10-22 Thread Philip Warner
At 01:02 AM 23/10/2002 -0400, Bruce Momjian wrote:


OK, you are saying if we don't have fseeko(), there is no reason to use
off_t, and we may as well use long.  What limitations does that impose,
and are the limitations clear to the user.


What I'm saying is that if we have not got fseeko then we should use any 
'seek-class' function that returns a 64 bit value. We have already made the 
assumption that off_t is an integer; the same logic that came to that 
conclusion, applies just as validly to the other seek functions.

Secondly, if there is no 64 bit 'seek-class' function, then we should 
probably use a size_t, but a long would probably be fine too. I am not 
particularly attached to this part; long, int etc etc. Whatever is most 
likely to return an integer and work with whatever function we choose.

As to implications: assuming they are all integers (which as you know I 
don't like), we should have no problems.

If a system does not have any function to access 64 bit file offsets, then 
I'd say they are pretty unlikely to have files > 2GB.





Philip Warner| __---_
Albatross Consulting Pty. Ltd.   |/   -  \
(A.B.N. 75 008 659 498)  |  /(@)   __---_
Tel: (+61) 0500 83 82 81 | _  \
Fax: (+61) 0500 83 82 82 | ___ |
Http://www.rhyme.com.au  |/   \|
 |----
PGP key available upon request,  |  /
and from pgp5.ai.mit.edu:11371   |/




Re: [HACKERS] pg_dump and large files - is this a problem?

2002-10-22 Thread Bruce Momjian

OK, you are saying if we don't have fseeko(), there is no reason to use
off_t, and we may as well use long.  What limitations does that impose,
and are the limitations clear to the user?

What has me confused is that I only see two places that use a non-zero
fseeko, and in those cases, there is a non-fseeko code path that does
the same thing, or the call isn't actually required.  Both cases are in
pg_dump/pg_backup_custom.c.  It appears seeking in the file is an
optimization that avoids having to read all the blocks.  That is
fine, but we shouldn't introduce failure cases to do that.
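
The fallback described above can be sketched roughly as follows.  This is
an illustration only, not the pg_dump code: skip_bytes and its can_seek
flag are hypothetical names standing in for the hasSeek test, and plain
fseek() is used so the sketch compiles anywhere.

```c
#include <stdio.h>

/*
 * Advance the stream by `count` bytes.  If `can_seek` is nonzero, use
 * fseek(); otherwise read and discard the data, which also works on
 * pipes.  Returns 0 on success, -1 on error or premature end of file.
 */
int
skip_bytes(FILE *fp, long count, int can_seek)
{
    char        buf[4096];

    if (can_seek)
        return (fseek(fp, count, SEEK_CUR) == 0) ? 0 : -1;

    while (count > 0)
    {
        size_t      want = (count < (long) sizeof(buf)) ? (size_t) count : sizeof(buf);
        size_t      got = fread(buf, 1, want, fp);

        if (got == 0)
            return -1;          /* hit EOF or a read error before finishing */
        count -= (long) got;
    }
    return 0;
}
```

Both paths leave the stream at the same position; the seeking path just
gets there without touching the intervening data.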

If BSD/OS is the only problem OS, I can deal with that, but I have no
idea if other OS's have the same limitation, and because of the way our
code exists now, we are not even checking to see if there is a problem.

I did some poking around, and on BSD/OS, fgetpos/fsetpos use fpos_t,
which is actually off_t, and interestingly, lseek() uses off_t too. 
Seems only fseek/ftell is limited to long.  I can easily implement
fseeko/ftello using fgetpos/fsetpos, but that is only one OS.

One idea would be to patch up BSD/OS in backend/port/bsdi and add a
configure test that actually fails if fseeko doesn't exist _and_
sizeof(off_t) > sizeof(long).  That would at least catch OS's before
they make >2gig backups that can't be restored.
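
The condition such a test would check can be written down as a small
sketch.  The helper name seek_would_truncate is hypothetical, and a real
configure test would more likely be a compile-time probe fed by
HAVE_FSEEKO, sizeof(off_t) and sizeof(long) than a runtime function.

```c
#include <stddef.h>

/*
 * Returns 1 when large-file offsets cannot be handed to the available
 * seek function without truncation: there is no fseeko(), and off_t is
 * wider than the long that plain fseek() accepts.
 */
int
seek_would_truncate(int have_fseeko, size_t off_t_size, size_t long_size)
{
    return !have_fseeko && off_t_size > long_size;
}
```

With the BSD/OS numbers from this thread (no fseeko, 8-byte off_t, 4-byte
long) the function reports a problem; any platform with fseeko, or with
off_t no wider than long, passes.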

---

Philip Warner wrote:
> At 10:46 PM 22/10/2002 -0400, Bruce Momjian wrote:
> >Uh, not exactly.  I have off_t as a quad, and I don't have fseeko, so
> >the above conditional doesn't work. I want to use off_t, but can't use
> >fseek().
> 
> Then when you create dumps, they will be invalid since I assume that ftello 
> is also broken in the same way. You need to fix _getFilePos as well. And 
> any other place that uses an off_t needs to be looked at very carefully. 
> The code was written assuming that if 'hasSeek' was set, then we could 
> trust it.
> 
> Given that you say you do have support for some kind of 64 bit offset, I 
> would be a lot happier with these changes if you did something akin to my 
> original suggestion:
> 
> #if defined(HAVE_FSEEKO)
> #define FILE_OFFSET off_t
> #define FSEEK fseeko
> #elif defined(HAVE_SOME_OTHER_FSEEK)
> #define FILE_OFFSET some_other_offset
> #define FSEEK some_other_fseek
> #else
> #define FILE_OFFSET long
> #define FSEEK fseek
> #endif
> 
> ...assuming you have a non-broken 64 bit fseek/tell pair, then this will 
> work in all cases, and make the code a lot less ugly (assuming of course 
> the non-broken version can be shifted).



Re: [HACKERS] pg_dump and large files - is this a problem?

2002-10-22 Thread Philip Warner
At 12:29 AM 23/10/2002 -0400, Bruce Momjian wrote:
> This fseeko/ftello/off_t is just too fluid, and the
> failure modes too serious.

I agree. Can you think of a better solution than the one I suggested???






Re: [HACKERS] pg_dump and large files - is this a problem?

2002-10-22 Thread Philip Warner
At 12:32 AM 23/10/2002 -0400, Tom Lane wrote:
> I am wondering why pg_dump has to depend on either fseek or ftell.

It doesn't - it just works better and has more features if they are 
available, much like zlib etc.





Re: [HACKERS] pg_dump and large files - is this a problem?

2002-10-22 Thread Tom Lane
Bruce Momjian <[EMAIL PROTECTED]> writes:
> I wonder if any other platforms have this limitation.  I think we need
> to add some type of test for no-fseeko()/ftello() and sizeof(off_t) >
> sizeof(long).  This fseeko/ftello/off_t is just too fluid, and the
> failure modes too serious.

I am wondering why pg_dump has to depend on either fseek or ftell.

regards, tom lane




Re: [HACKERS] pg_dump and large files - is this a problem?

2002-10-22 Thread Bruce Momjian

Sounds messy.  Let me see if I can code up an fseeko/ftello for BSD/OS
and add that to /port.  No reason to hold up beta for that, though.

I wonder if any other platforms have this limitation.  I think we need
to add some type of test for no-fseeko()/ftello() and sizeof(off_t) >
sizeof(long).  This fseeko/ftello/off_t is just too fluid, and the
failure modes too serious.

---

Philip Warner wrote:
> At 10:46 PM 22/10/2002 -0400, Bruce Momjian wrote:
> >Uh, not exactly.  I have off_t as a quad, and I don't have fseeko, so
> >the above conditional doesn't work. I want to use off_t, but can't use
> >fseek().
> 
> Then when you create dumps, they will be invalid since I assume that ftello 
> is also broken in the same way. You need to fix _getFilePos as well. And 
> any other place that uses an off_t needs to be looked at very carefully. 
> The code was written assuming that if 'hasSeek' was set, then we could 
> trust it.
> 
> Given that you say you do have support for some kind of 64 bit offset, I 
> would be a lot happier with these changes if you did something akin to my 
> original suggestion:
> 
> #if defined(HAVE_FSEEKO)
> #define FILE_OFFSET off_t
> #define FSEEK fseeko
> #elif defined(HAVE_SOME_OTHER_FSEEK)
> #define FILE_OFFSET some_other_offset
> #define FSEEK some_other_fseek
> #else
> #define FILE_OFFSET long
> #define FSEEK fseek
> #endif
> 
> ...assuming you have a non-broken 64 bit fseek/tell pair, then this will 
> work in all cases, and make the code a lot less ugly (assuming of course 
> the non-broken version can be shifted).



Re: [HACKERS] pg_dump and large files - is this a problem?

2002-10-22 Thread Philip Warner
At 10:46 PM 22/10/2002 -0400, Bruce Momjian wrote:
> Uh, not exactly.  I have off_t as a quad, and I don't have fseeko, so
> the above conditional doesn't work. I want to use off_t, but can't use
> fseek().

Then when you create dumps, they will be invalid since I assume that ftello 
is also broken in the same way. You need to fix _getFilePos as well. And 
any other place that uses an off_t needs to be looked at very carefully. 
The code was written assuming that if 'hasSeek' was set, then we could 
trust it.

Given that you say you do have support for some kind of 64 bit offset, I 
would be a lot happier with these changes if you did something akin to my 
original suggestion:

#if defined(HAVE_FSEEKO)
#define FILE_OFFSET off_t
#define FSEEK fseeko
#elif defined(HAVE_SOME_OTHER_FSEEK)
#define FILE_OFFSET some_other_offset
#define FSEEK some_other_fseek
#else
#define FILE_OFFSET long
#define FSEEK fseek
#endif

...assuming you have a non-broken 64 bit fseek/tell pair, then this will 
work in all cases, and make the code a lot less ugly (assuming of course 
the non-broken version can be shifted).
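
As a rough sketch of how calling code might look under this scheme: the
FTELL macro and the seek_to helper below are additions for illustration
and were not part of the suggestion, HAVE_FSEEKO is assumed to come from
configure, and the placeholder "some other fseek" branch is left out
because no concrete function was named.

```c
#include <stdio.h>
#include <sys/types.h>

/* HAVE_FSEEKO is assumed to be set by configure when fseeko() exists. */
#if defined(HAVE_FSEEKO)
#define FILE_OFFSET off_t
#define FSEEK fseeko
#define FTELL ftello
#else
#define FILE_OFFSET long
#define FSEEK fseek
#define FTELL ftell
#endif

/*
 * Seek to an absolute position and report where the stream ended up.
 * Callers only ever see FILE_OFFSET, so the offset type and the seek
 * function always match each other.
 */
FILE_OFFSET
seek_to(FILE *fp, FILE_OFFSET pos)
{
    if (FSEEK(fp, pos, SEEK_SET) != 0)
        return (FILE_OFFSET) -1;
    return FTELL(fp);
}
```

The point of the pairing is that an 8-byte off_t can never be passed to a
function that only takes a long: whichever branch is selected defines the
type and the functions together.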






Re: [HACKERS] pg_dump and large files - is this a problem?

2002-10-22 Thread Bruce Momjian
Philip Warner wrote:
> At 05:37 PM 22/10/2002 -0400, Bruce Momjian wrote:
> >!   if (ctx->hasSeek
> >! #if !defined(HAVE_FSEEKO)
> >!   && sizeof(off_t) <= sizeof(long)
> >! #endif
> >!   )
> 
> Just to clarify my understanding:
> 
> - HAVE_FSEEKO is tested & defined in configure
> - If it is not defined, then all calls to fseeko will magically be 
> translated to fseek calls, and use the 'long' parameter type.
> 
> Is that right?
> 
> If so, why don't we:
> 
> #if defined(HAVE_FSEEKO)
> #define FILE_OFFSET off_t
> #define FSEEK fseeko
> #else
> #define FILE_OFFSET long
> #define FSEEK fseek
> #endif
> 
> then replace all refs to off_t with FILE_OFFSET, and fseeko with FSEEK.
> 
> Existing checks etc will then refuse to load file offsets with significant 
> bytes after the 4th byte, we will still use fseek/o in broken OS 
> implementations of off_t.

Uh, not exactly.  I have off_t as a quad, and I don't have fseeko, so
the above conditional doesn't work. I want to use off_t, but can't use
fseek().  As it turns out, the code already has options to handle no
fseek, so it seems to work anyway.  I think what you miss may be the
table of contents in the archive, if I am reading the code correctly.




Re: [HACKERS] pg_dump and large files - is this a problem?

2002-10-22 Thread Philip Warner
At 05:37 PM 22/10/2002 -0400, Bruce Momjian wrote:
> !   if (ctx->hasSeek
> ! #if !defined(HAVE_FSEEKO)
> !   && sizeof(off_t) <= sizeof(long)
> ! #endif
> !   )

Just to clarify my understanding:

- HAVE_FSEEKO is tested & defined in configure
- If it is not defined, then all calls to fseeko will magically be 
translated to fseek calls, and use the 'long' parameter type.

Is that right?

If so, why don't we:

#if defined(HAVE_FSEEKO)
#define FILE_OFFSET off_t
#define FSEEK fseeko
#else
#define FILE_OFFSET long
#define FSEEK fseek
#endif

then replace all refs to off_t with FILE_OFFSET, and fseeko with FSEEK.

Existing checks etc will then refuse to load file offsets with significant 
bytes after the 4th byte, we will still use fseek/o in broken OS 
implementations of off_t.






Re: [HACKERS] pg_dump and large files - is this a problem?

2002-10-22 Thread Bruce Momjian
Peter Eisentraut wrote:
> Bruce Momjian writes:
> 
> > I am concerned about one more thing.  On BSD/OS, we have off_t of quad
> > (8 byte), but we don't have fseeko, so this call looks questionable:
> >
> > if (fseeko(AH->FH, tctx->dataPos, SEEK_SET) != 0)
> 
> Maybe you want to ask your OS provider how the heck this is supposed to
> work.  I mean, it's great to have wide types, but what's the point if the
> API can't handle them?

Excellent question.  They do have fsetpos/fgetpos, and I think they
think you are supposed to use those.  However, they don't do seek from
current position, and they don't take an off_t, so I am confused myself.

I did ask on the mailing list and everyone kind of agreed it was a
missing feature.  However, because of the way we call fseeko not knowing
if it is a quad or a long, I think we have to add the checks to prevent
such wild seeks from happening.




Re: [HACKERS] pg_dump and large files - is this a problem?

2002-10-22 Thread Peter Eisentraut
Bruce Momjian writes:

> I am concerned about one more thing.  On BSD/OS, we have off_t of quad
> (8 byte), but we don't have fseeko, so this call looks questionable:
>
>   if (fseeko(AH->FH, tctx->dataPos, SEEK_SET) != 0)

Maybe you want to ask your OS provider how the heck this is supposed to
work.  I mean, it's great to have wide types, but what's the point if the
API can't handle them?

-- 
Peter Eisentraut   [EMAIL PROTECTED]





Re: [HACKERS] pg_dump and large files - is this a problem?

2002-10-22 Thread Bruce Momjian
Bruce Momjian wrote:
> > So I think we're wasting our time to debate whether we need to support
> > non-integral off_t ... let's just apply Bruce's version and wait to
> > see if anyone has a problem before doing more work.
> 
> I am concerned about one more thing.  On BSD/OS, we have off_t of quad
> (8 byte), but we don't have fseeko, so this call looks questionable:
> 
>   if (fseeko(AH->FH, tctx->dataPos, SEEK_SET) != 0)
> 
> In this case, dataPos is off_t (8 bytes), while fseek only accepts long
> in that parameter (4 bytes).  When this code is hit, a file > 4 gigs
> will seek to the wrong offset, I am afraid.  Also, I don't understand
> why the compiler doesn't produce a warning.
> 
> I wonder if I should add a conditional test so this code is hit only if
> HAVE_FSEEKO is defined.  There is alternative code for all the non-zero
> fseeks.

Here is a patch that I think fixes the problem I outlined above.  If
there is no fseeko(), it will not call fseek with a non-zero offset
unless sizeof(off_t) <= sizeof(long).


Index: src/bin/pg_dump/pg_backup_custom.c
===
RCS file: /cvsroot/pgsql-server/src/bin/pg_dump/pg_backup_custom.c,v
retrieving revision 1.22
diff -c -c -r1.22 pg_backup_custom.c
*** src/bin/pg_dump/pg_backup_custom.c  22 Oct 2002 19:15:23 -  1.22
--- src/bin/pg_dump/pg_backup_custom.c  22 Oct 2002 21:36:30 -
***************
*** 431,437 ****
      if (tctx->dataState == K_OFFSET_NO_DATA)
          return;
  
!     if (!ctx->hasSeek || tctx->dataState == K_OFFSET_POS_NOT_SET)
      {
          /* Skip over unnecessary blocks until we get the one we want. */
  
--- 431,441 ----
      if (tctx->dataState == K_OFFSET_NO_DATA)
          return;
  
!     if (!ctx->hasSeek || tctx->dataState == K_OFFSET_POS_NOT_SET
! #if !defined(HAVE_FSEEKO)
!         || sizeof(off_t) > sizeof(long)
! #endif
!         )
      {
          /* Skip over unnecessary blocks until we get the one we want. */
  
***************
*** 809,815 ****
       * be ok to just use the existing self-consistent block
       * formatting.
       */
!     if (ctx->hasSeek)
      {
          fseeko(AH->FH, tpos, SEEK_SET);
          WriteToc(AH);
--- 813,823 ----
       * be ok to just use the existing self-consistent block
       * formatting.
       */
!     if (ctx->hasSeek
! #if !defined(HAVE_FSEEKO)
!         && sizeof(off_t) <= sizeof(long)
! #endif
!         )
      {
          fseeko(AH->FH, tpos, SEEK_SET);
          WriteToc(AH);





Re: [HACKERS] pg_dump and large files - is this a problem?

2002-10-22 Thread Bruce Momjian

Patch applied with shift <> changes by me.  Thanks.

---


Philip Warner wrote:
> 
> I have put the latest patch at:
> 
>  http://downloads.rhyme.com.au/postgresql/pg_dump/
> 
> along with two dump files of the regression DB, one with 4 byte
> and the other with 8 byte offsets. I can read/restore each from
> the other, so it looks pretty good. Once the endianness is tested,
> we should be OK.
> 
> Known problems:
> 
> - will not cope with > 4GB files and size_t not 64 bit.
> - when printing data position, it is assumed that off_t is UINT64
>(we could remove this entirely - it's just for display)
> - if seek is not supported, then an intXX is assigned to off_t
>when file offsets are needed. This *should* not cause a problem
>since without seek, the offsets will not be written to the file.
> 
> Changes from Prior Version:
> 
> - No longer stores or outputs data length
> - Assumes result of ftello is correct if it disagrees with internally
>kept tally.
> - 'pg_restore -l' now shows sizes of int and offset.



Re: [HACKERS] pg_dump and large files - is this a problem?

2002-10-22 Thread Tom Lane
Bruce Momjian <[EMAIL PROTECTED]> writes:
> Your version will break more often because we are assuming we can
> determine the endianness of the OS, _and_, for quad off_t types,
> assuming we know they are stored the same way too.  While we know the
> endianness for ints, I have no idea if quads are always stored the same.

There is precedent for problems of that ilk, too, cf PDP_ENDIAN: years
ago someone made double-word-integer software routines and did not
think twice about which word should appear first in storage, with the
consequence that the storage order was neither little-endian nor
big-endian.  (We have exactly the same issue with our CRC routines for
compilers without int64: the two-int32 struct is defined in a way that's
compatible with little-endian storage, and on a big-endian machine it'll
produce a funny storage order.)

Unless someone can point to a supported (or potentially interesting)
platform on which off_t is indeed not integral, I think the shift-based
code is our safest bet.  (The precedent of the off_t checking code in
configure makes me really doubt that there are any platforms with
non-integral off_t.)

regards, tom lane




Re: [HACKERS] pg_dump and large files - is this a problem?

2002-10-22 Thread Bruce Momjian
Tom Lane wrote:
> Bruce Momjian <[EMAIL PROTECTED]> writes:
> > However, since we don't know if we support any non-integral off_t
> > platforms, and because a configure test would require us to have two
> > code paths for with/without integral off_t, I suggest we apply my
> > version of Philip's patch and let's see if everyone can compile it
> > cleanly.
> 
> Actually, it looks to me like configure will spit up if off_t is not
> an integral type:
> 
>  /* Check that off_t can represent 2**63 - 1 correctly.
> We can't simply define LARGE_OFF_T to be 9223372036854775807,
> since some C++ compilers masquerading as C compilers
> incorrectly reject 9223372036854775807.  */
> #define LARGE_OFF_T (((off_t) 1 << 62) - 1 + ((off_t) 1 << 62))
>   int off_t_is_large[(LARGE_OFF_T % 2147483629 == 721
>  && LARGE_OFF_T % 2147483647 == 1)
> ? 1 : -1];
> 
> So I think we're wasting our time to debate whether we need to support
> non-integral off_t ... let's just apply Bruce's version and wait to
> see if anyone has a problem before doing more work.

I am concerned about one more thing.  On BSD/OS, we have off_t of quad
(8 byte), but we don't have fseeko, so this call looks questionable:

if (fseeko(AH->FH, tctx->dataPos, SEEK_SET) != 0)

In this case, dataPos is off_t (8 bytes), while fseek only accepts long
in that parameter (4 bytes).  When this code is hit, a file > 4 gigs
will seek to the wrong offset, I am afraid.  Also, I don't understand
why the compiler doesn't produce a warning.
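
To make the failure concrete, here is a small model of what a seek
function limited to 32 bits would see if handed a 64-bit offset.  This is
a sketch under a stated assumption, namely that only the low 32 bits
survive; what really happens when an off_t is passed where a long is
expected depends on the platform and is not guaranteed.  The name
low_32_bits is made up for the example.

```c
#include <stdint.h>

/*
 * Model of a 64-bit offset being squeezed into a 32-bit parameter:
 * only the low 32 bits are kept.  Unsigned types are used so that the
 * truncation is well defined in C.
 */
uint32_t
low_32_bits(uint64_t offset)
{
    return (uint32_t) (offset & 0xFFFFFFFFu);
}
```

Under that assumption a request to seek to the 5 GB mark would land at
1 GB, with no error reported, which is why the silent case is the
worrying one.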

I wonder if I should add a conditional test so this code is hit only if
HAVE_FSEEKO is defined.  There is alternative code for all the non-zero
fseeks.

Comments?




Re: [HACKERS] pg_dump and large files - is this a problem?

2002-10-22 Thread Bruce Momjian
Philip Warner wrote:
> At 12:00 PM 22/10/2002 -0400, Bruce Momjian wrote:
> >It does have the advantage of being more portable on systems
> >that do have integral off_t
> 
> I suspect it is no more portable than determining storage order by using 
> 'int i = 256', then writing in storage order, and has the disadvantage that 
> it may break as discussed.
> 
> AFAICT, using storage order will not break under any circumstances within 
> one OS/architecture (unlike using shift), and will not break any more often 
> than using shift in cases where off_t is integral.

Your version will break more often because we are assuming we can
determine the endianness of the OS, _and_, for quad off_t types,
assuming we know they are stored the same way too.  While we know the
endianness for ints, I have no idea if quads are always stored the same
way.  By accessing off_t as an integral type, we make certain it is output
the same way every time for every OS.




Re: [HACKERS] pg_dump and large files - is this a problem?

2002-10-22 Thread Tom Lane
Bruce Momjian <[EMAIL PROTECTED]> writes:
> However, since we don't know if we support any non-integral off_t
> platforms, and because a configure test would require us to have two
> code paths for with/without integral off_t, I suggest we apply my
> version of Philip's patch and let's see if everyone can compile it
> cleanly.

Actually, it looks to me like configure will spit up if off_t is not
an integral type:

 /* Check that off_t can represent 2**63 - 1 correctly.
We can't simply define LARGE_OFF_T to be 9223372036854775807,
since some C++ compilers masquerading as C compilers
incorrectly reject 9223372036854775807.  */
#define LARGE_OFF_T (((off_t) 1 << 62) - 1 + ((off_t) 1 << 62))
  int off_t_is_large[(LARGE_OFF_T % 2147483629 == 721
   && LARGE_OFF_T % 2147483647 == 1)
  ? 1 : -1];

So I think we're wasting our time to debate whether we need to support
non-integral off_t ... let's just apply Bruce's version and wait to
see if anyone has a problem before doing more work.
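
For anyone puzzled by the magic numbers in that check, here is a sketch
with int64_t standing in for off_t (an assumption made so the example
does not depend on the local off_t).  Since 2147483629 is 2^31 - 19,
2^31 leaves remainder 19, so 2^63 - 1 leaves 2*19*19 - 1 = 721; and since
2147483647 is 2^31 - 1, 2^63 - 1 leaves 2*1 - 1 = 1.  A type that cannot
shift, or that is too narrow, cannot produce both remainders.

```c
#include <stdint.h>

/* Same construction as the configure snippet: 2^63 - 1 built from shifts
 * so that no oversized literal is needed. */
#define LARGE_VALUE ((((int64_t) 1 << 62) - 1) + ((int64_t) 1 << 62))

/* Compile-time form: the array size becomes -1, and the build fails,
 * unless both remainders come out as expected. */
typedef int value_is_large[(LARGE_VALUE % 2147483629 == 721
                            && LARGE_VALUE % 2147483647 == 1) ? 1 : -1];

/* Runtime form of the same check, returning 1 when it holds. */
int
large_value_check(void)
{
    return LARGE_VALUE % 2147483629 == 721 && LARGE_VALUE % 2147483647 == 1;
}
```

Because the test only compiles when the type supports << and %, it
doubles as a check that the type is integral, which is the observation
made above.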

regards, tom lane




Re: [HACKERS] pg_dump and large files - is this a problem?

2002-10-22 Thread Philip Warner
At 12:00 PM 22/10/2002 -0400, Bruce Momjian wrote:
> It does have the advantage of being more portable on systems
> that do have integral off_t

I suspect it is no more portable than determining storage order by using 
'int i = 256', then writing in storage order, and has the disadvantage that 
it may break as discussed.

AFAICT, using storage order will not break under any circumstances within 
one OS/architecture (unlike using shift), and will not break any more often 
than using shift in cases where off_t is integral.
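
A minimal sketch of the storage-order approach, for comparison.  The
function names are invented for the example and unsigned long stands in
for the offset type; the 'int i = 256' probe is the one mentioned above.

```c
#include <string.h>

/*
 * Returns 1 on a little-endian machine and 0 otherwise, by looking at
 * where the single nonzero byte of the value 256 is stored.
 */
int
is_little_endian(void)
{
    unsigned int    i = 256;
    unsigned char   bytes[sizeof(i)];

    memcpy(bytes, &i, sizeof(i));
    return bytes[1] == 1;
}

/* Copy a value out in whatever order this machine stores it. */
void
write_storage_order(unsigned char *out, unsigned long value)
{
    memcpy(out, &value, sizeof(value));
}

/* Read back a value written by write_storage_order on the same machine. */
unsigned long
read_storage_order(const unsigned char *in)
{
    unsigned long   value;

    memcpy(&value, in, sizeof(value));
    return value;
}
```

This never depends on the type supporting arithmetic, so it round-trips on
the machine that wrote it; the price is that the bytes mean something
different on a machine with another byte order or another type width.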





Re: [HACKERS] pg_dump and large files - is this a problem?

2002-10-22 Thread Bruce Momjian
Tom Lane wrote:
> Philip Warner <[EMAIL PROTECTED]> writes:
> > None, but it will be compatible with itself (the most we can hope for), and 
> > will work even if shifting is not supported for off_t (how likely is 
> > that?). I agree shift is definitely the way to go if it works on arbitrary 
> > data - ie. it does not rely on off_t being an integer. Can I shift a struct?
> 
> You can't.  If there are any platforms where in fact off_t isn't an
> arithmetic type, then shifting code would break there.  I am not sure
> there are any; can anyone provide a counterexample?
> 
> It would be simple enough to add a configure test to see whether off_t
> is arithmetic (just try to compile "off_t x; x <<= 8;").  How about
>   #ifdef OFF_T_IS_ARITHMETIC_TYPE
>   // cross-platform compatible
>   use shifting method
>   #else
>   // not cross-platform compatible
>   read or write bytes of struct in storage order
>   #endif

It is my understanding that off_t is an integral type and fpos_t is
perhaps a struct.  My fgetpos manual page says:

 The fgetpos() and fsetpos() functions are alternate interfaces equivalent
 to ftell() and fseek() (with whence set to SEEK_SET), setting and storing
 the current value of the file offset into or from the object referenced
 by pos. On some (non-UNIX) systems an ``fpos_t'' object may be a complex
 object and these routines may be the only way to portably reposition a
 text stream.

I poked around and found this Usenet posting:


http://groups.google.com/groups?q=C+off_t+standard+integral&hl=en&lr=&ie=UTF-8&oe=UTF-8&selm=E958tG.8tH%40root.co.uk&rnum=1

stating that while off_t must be arithmetic, it doesn't have to be
integral, meaning it could be float or double, which can't be shifted.

However, since we don't know if we support any non-integral off_t
platforms, and because a configure test would require us to have two
code paths for with/without integral off_t, I suggest we apply my
version of Philip's patch and let's see if everyone can compile it
cleanly.  It does have the advantage of being more portable on systems
that do have integral off_t, which I think is most/all of our supported
platforms.




Re: [HACKERS] pg_dump and large files - is this a problem?

2002-10-22 Thread Tom Lane
Philip Warner <[EMAIL PROTECTED]> writes:
> None, but it will be compatible with itself (the most we can hope for), and 
> will work even if shifting is not supported for off_t (how likely is 
> that?). I agree shift is definitely the way to go if it works on arbitrary 
> data - ie. it does not rely on off_t being an integer. Can I shift a struct?

You can't.  If there are any platforms where in fact off_t isn't an
arithmetic type, then shifting code would break there.  I am not sure
there are any; can anyone provide a counterexample?

It would be simple enough to add a configure test to see whether off_t
is arithmetic (just try to compile "off_t x; x <<= 8;").  How about
#ifdef OFF_T_IS_ARITHMETIC_TYPE
// cross-platform compatible
use shifting method
#else
// not cross-platform compatible
read or write bytes of struct in storage order
#endif
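
The shifting branch could look something like the sketch below.  It is
only an illustration: uint64_t stands in for an integral off_t, the
function names are invented, and writing the least significant byte first
is an arbitrary choice for the example rather than a statement about the
archive format.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Write `value` as `size` bytes, least significant byte first, using
 * only shifts and masks so the result does not depend on how the host
 * stores integers.
 */
void
write_offset(unsigned char *out, uint64_t value, size_t size)
{
    size_t      i;

    for (i = 0; i < size; i++)
    {
        out[i] = (unsigned char) (value & 0xFF);
        value >>= 8;
    }
}

/* Rebuild the value from bytes produced by write_offset. */
uint64_t
read_offset(const unsigned char *in, size_t size)
{
    uint64_t    value = 0;
    size_t      i;

    for (i = size; i > 0; i--)
        value = (value << 8) | in[i - 1];
    return value;
}
```

A file written this way reads back identically on any machine, which is
the cross-platform property the first branch is after.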

regards, tom lane




Re: [HACKERS] pg_dump and large files - is this a problem?

2002-10-22 Thread Philip Warner
At 10:16 AM 21/10/2002 -0400, Tom Lane wrote:
> What are
> the odds that dumping the bytes in it, in either order, will produce
> something that's compatible with any other platform?

None, but it will be compatible with itself (the most we can hope for), and 
will work even if shifting is not supported for off_t (how likely is 
that?). I agree shift is definitely the way to go if it works on arbitrary 
data - ie. it does not rely on off_t being an integer. Can I shift a struct?





Re: [HACKERS] pg_dump and large files - is this a problem?

2002-10-22 Thread Philip Warner
At 09:52 PM 21/10/2002 -0400, Bruce Momjian wrote:
> 4) pg_restore  -Fc  

pg_restore  /tmp/x

is enough; it will determine the file type, and by avoiding the pipe, you 
allow it to do seeks, which are not much use here but are useful when you 
only restore one table in a very large backup.





Re: [HACKERS] pg_dump and large files - is this a problem?

2002-10-21 Thread Larry Rosenman
On Mon, 2002-10-21 at 20:52, Bruce Momjian wrote:
> Larry Rosenman wrote:
> > On Mon, 2002-10-21 at 20:47, Bruce Momjian wrote:
> > > 
> > > Here is a modified version of Philip's patch that has the changes Tom
> > > suggested;  treating off_t as an integral type.  I did light testing on
> > > my BSD/OS machine that has 8-byte off_t but I don't have 4 gigs of free
> > > space to test larger files.  
> > I can make an account for anyone that wants to play on UnixWare 7.1.3.
> 
> If you have 7.3, you can just test this way:
I haven't had the time to play with 7.3 (busy on a NUMBER of other
things). 

I'm more than willing to supply resources, just my time is short right
now. 


>   
>   1) apply the patch
>   2) run the regression tests
>   3) pg_dump -Fc regression >/tmp/x
>   4) pg_restore  -Fc   
> That's all I did and it worked.
-- 
Larry Rosenman http://www.lerctr.org/~ler
Phone: +1 972-414-9812 E-Mail: [EMAIL PROTECTED]
US Mail: 1905 Steamboat Springs Drive, Garland, TX 75044-6749





Re: [HACKERS] pg_dump and large files - is this a problem?

2002-10-21 Thread Bruce Momjian
Larry Rosenman wrote:
> On Mon, 2002-10-21 at 20:47, Bruce Momjian wrote:
> > 
> > Here is a modified version of Philip's patch that has the changes Tom
> > suggested;  treating off_t as an integral type.  I did light testing on
> > my BSD/OS machine that has 8-byte off_t but I don't have 4 gigs of free
> > space to test larger files.  
> I can make an account for anyone that wants to play on UnixWare 7.1.3.

If you have 7.3, you can just test this way:

1) apply the patch
2) run the regression tests
3) pg_dump -Fc regression >/tmp/x
4) pg_restore  -Fc

That's all I did and it worked.

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  [EMAIL PROTECTED]   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073




Re: [HACKERS] pg_dump and large files - is this a problem?

2002-10-21 Thread Larry Rosenman
On Mon, 2002-10-21 at 20:47, Bruce Momjian wrote:
> 
> Here is a modified version of Philip's patch that has the changes Tom
> suggested;  treating off_t as an integral type.  I did light testing on
> my BSD/OS machine that has 8-byte off_t but I don't have 4 gigs of free
> space to test larger files.  
I can make an account for anyone that wants to play on UnixWare 7.1.3.






Re: [HACKERS] pg_dump and large files - is this a problem?

2002-10-21 Thread Bruce Momjian

Here is a modified version of Philip's patch that has the changes Tom
suggested;  treating off_t as an integral type.  I did light testing on
my BSD/OS machine that has 8-byte off_t but I don't have 4 gigs of free
space to test larger files.  

ftp://candle.pha.pa.us/pub/postgresql/mypatches/pg_dump

Can others test?

---

Tom Lane wrote:
> Philip Warner <[EMAIL PROTECTED]> writes:
> > then checking the first byte? This should give me the endianness, and makes 
> > a non-destructive write (not sure if it's important). Currently the 
> > commonly used code does not rely on off_t arithmetic, so if possible I'd 
> > like to avoid shift. Does that sound reasonable? Or overly cautious?
> 
> I think it's pointless.  Let's assume off_t is not an arithmetic type
> but some weird struct dreamed up by a crazed kernel hacker.  What are
> the odds that dumping the bytes in it, in either order, will produce
> something that's compatible with any other platform?  There could be
> padding, or the fields might be in an order that doesn't match the
> byte order within the fields, or something else.
> 
> The shift method requires *no* directly endian-dependent code,
> and I think it will work on any platform where you have any hope of
> portability anyway.
> 
>   regards, tom lane




Re: [HACKERS] pg_dump and large files - is this a problem?

2002-10-20 Thread Bruce Momjian
Philip Warner wrote:
> At 09:18 PM 20/10/2002 -0400, Bruce Momjian wrote:
> >I will try to apply it within the next 48 hours.
> 
> I'm happy to apply it when necessary, but I wouldn't do it until we've heard 
> from someone with a big-endian machine...

Well, I think Tom was going to try it on his HPUX machine.  However, it
is on the open items list, so we are going to need to get it in there
soon anyway, or yank it all out.  If no big endian people want to test
it, we will have to ship and then I am sure some big-endian testing will
happen.  ;-)




Re: [HACKERS] pg_dump and large files - is this a problem?

2002-10-20 Thread Philip Warner
At 09:18 PM 20/10/2002 -0400, Bruce Momjian wrote:

> I will try to apply it within the next 48 hours.


I'm happy to apply it when necessary, but I wouldn't do it until we've heard 
from someone with a big-endian machine...






Re: [HACKERS] pg_dump and large files - is this a problem?

2002-10-20 Thread Bruce Momjian

Your patch has been added to the PostgreSQL unapplied patches list at:

http://momjian.postgresql.org/cgi-bin/pgpatches

I will try to apply it within the next 48 hours.

---


Philip Warner wrote:
> 
> I have put the latest patch at:
> 
>  http://downloads.rhyme.com.au/postgresql/pg_dump/
> 
> along with two dump files of the regression DB, one with 4 byte
> and the other with 8 byte offsets. I can read/restore each from
> the other, so it looks pretty good. Once the endianness is tested,
> we should be OK.
> 
> Known problems:
> 
> - will not cope with files > 4GB when size_t is not 64 bit.
> - when printing data position, it is assumed that off_t is UINT64
>(we could remove this entirely - it's just for display)
> - if seek is not supported, then an intXX is assigned to off_t
>when file offsets are needed. This *should* not cause a problem
>since without seek, the offsets will not be written to the file.
> 
> Changes from Prior Version:
> 
> - No longer stores or outputs data length
> - Assumes result of ftello is correct if it disagrees with internally
>kept tally.
> - 'pg_restore -l' now shows sizes of int and offset.
> 
> 
> 




Re: [HACKERS] pg_dump and large files - is this a problem?

2002-10-18 Thread Philip Warner

I have put the latest patch at:

http://downloads.rhyme.com.au/postgresql/pg_dump/

along with two dump files of the regression DB, one with 4 byte
and the other with 8 byte offsets. I can read/restore each from
the other, so it looks pretty good. Once the endianness is tested,
we should be OK.

Known problems:

- will not cope with files > 4GB when size_t is not 64 bit.
- when printing data position, it is assumed that off_t is UINT64
  (we could remove this entirely - it's just for display)
- if seek is not supported, then an intXX is assigned to off_t
  when file offsets are needed. This *should* not cause a problem
  since without seek, the offsets will not be written to the file.

Changes from Prior Version:

- No longer stores or outputs data length
- Assumes result of ftello is correct if it disagrees with internally
  kept tally.
- 'pg_restore -l' now shows sizes of int and offset.
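
On the display problem listed above, one way to avoid assuming that off_t is exactly a 64-bit type is to widen it to long long before printing. The following is a sketch under the assumption that off_t is integral; format_offset is a hypothetical name, not a function from the patch.

```c
#include <stdio.h>
#include <sys/types.h>

/*
 * Hypothetical display helper: format an offset without assuming off_t
 * is exactly 64 bits wide, by widening it to long long (which C99
 * guarantees to be at least 64 bits). Assumes off_t is integral.
 * Returns whatever snprintf returns.
 */
int
format_offset(char *dst, size_t dstlen, off_t offset)
{
    return snprintf(dst, dstlen, "%lld", (long long) offset);
}
```

The cast is harmless when off_t is 4 bytes and lossless when it is 8, so the same format string serves both builds.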






Re: [HACKERS] pg_dump and large files - is this a problem?

2002-10-18 Thread Philip Warner
At 12:07 AM 19/10/2002 +0200, Peter Eisentraut wrote:

> Any old machine has a 4-byte off_t if you configure with
> --disable-largefile.


Thanks - done. I just dumped to a custom backup file, then dumped it to 
SQL, and compared each version (V7.2.1, 8 byte & 4 byte offsets), and they 
all looked OK. Also, the 4 byte version reads the 8 byte offset version 
correctly - although I have not checked reading > 4GB files with 4 byte 
offset, but it's not a priority for obvious reasons.
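
The case of a 4 byte build reading an archive written with 8 byte offsets can be sketched as follows: rebuild the value with shifts and refuse it if it does not fit in the local off_t. This assumes offsets are stored least significant byte first and are non-negative; read_offset_bytes and its return convention are hypothetical and not taken from the patch.

```c
#include <stddef.h>
#include <sys/types.h>

/*
 * Hypothetical reader: rebuild an offset from `len` bytes stored least
 * significant byte first, where `len` is the offset size recorded by the
 * writer (4 or 8 in this thread).
 * Returns 0 on success, or -1 if the stored value cannot be represented
 * in this platform's off_t. Assumes off_t is integral.
 */
int
read_offset_bytes(const unsigned char *buf, size_t len, off_t *result)
{
    unsigned long long value = 0;
    off_t       converted;
    size_t      i;

    if (len > sizeof(value))
        return -1;
    for (i = 0; i < len; i++)
        value |= ((unsigned long long) buf[i]) << (i * 8);

    /* Reject values that are truncated or turn negative in off_t. */
    converted = (off_t) value;
    if (converted < 0 || (unsigned long long) converted != value)
        return -1;

    *result = converted;
    return 0;
}
```

An 8 byte stored offset whose upper four bytes are zero reads back correctly on a 4 byte build; anything larger is reported as an error instead of being silently truncated.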

So once Giles gets back to me (Monday), I'll commit the changes.






Re: [HACKERS] pg_dump and large files - is this a problem?

2002-10-18 Thread Peter Eisentraut
Philip Warner writes:

>
> I have made the changes to pg_dump and verified that (a) it reads old
> files, (b) it handles 8 byte offsets, and (c) it dumps & seems to restore
> (at least to /dev/null).
>
> I don't have a lot of options for testing it - should I just apply the
> changes and wait for the problems, or can someone offer a bigendian machine
> and/or a 4 byte off_t machine?

Any old machine has a 4-byte off_t if you configure with
--disable-largefile.  This could be a neat way to test:  Make two
installations configured different ways and move data back and forth
between them until it changes.  ;-)
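
When setting up two such installations, it can help to confirm which off_t width each build actually got. The sketch below simply reports the width at run time; off_t_bits is a hypothetical name. On many platforms the width is selected at compile time by large-file macros such as _FILE_OFFSET_BITS, but that detail is an assumption here and is not stated in this thread.

```c
#include <limits.h>
#include <sys/types.h>

/*
 * Hypothetical helper: report the width of off_t, in bits, for the
 * build this code was compiled in.
 */
int
off_t_bits(void)
{
    return (int) (sizeof(off_t) * CHAR_BIT);
}
```

A build made with large file support disabled would be expected to report 32 here, and a large-file build 64.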

-- 
Peter Eisentraut   [EMAIL PROTECTED]





Re: [HACKERS] pg_dump and large files - is this a problem?

2002-10-18 Thread Tom Lane
Philip Warner <[EMAIL PROTECTED]> writes:
> I don't have a lot of options for testing it - should I just apply the 
> changes and wait for the problems, or can someone offer a bigendian machine 
> and/or a 4 byte off_t machine?

My HP is big-endian; send in the patch and I'll check it here...

regards, tom lane




Re: [HACKERS] pg_dump and large files - is this a problem?

2002-10-17 Thread Philip Warner

I have made the changes to pg_dump and verified that (a) it reads old 
files, (b) it handles 8 byte offsets, and (c) it dumps & seems to restore 
(at least to /dev/null).

I don't have a lot of options for testing it - should I just apply the 
changes and wait for the problems, or can someone offer a bigendian machine 
and/or a 4 byte off_t machine?








Re: [HACKERS] pg_dump and large files - is this a problem?

2002-10-03 Thread Philip Warner

At 11:07 PM 3/10/2002 -0400, Tom Lane wrote:
>A non-integral representation
>of off_t is theoretically possible but I don't believe it exists in
>practice.

Excellent. So I can just read/write the bytes in an appropriate order and 
expect whatever size it is to be a single intXX.

Fine with me, unless anybody voices another opinion in the next day, I will 
proceed. I just have this vague recollection of seeing a header file with a 
more complex structure for off_t. I'm probably dreaming.








Re: [HACKERS] pg_dump and large files - is this a problem?

2002-10-03 Thread Bruce Momjian

Tom Lane wrote:
> Giles Lean <[EMAIL PROTECTED]> writes:
> > When talking of near-current systems with 64 bit off_t you are not
> > going to find one without support for 64 bit integral types.
> 
> I tend to agree with Giles on this point.  A non-integral representation
> of off_t is theoretically possible but I don't believe it exists in
> practice.  Before going far out of our way to allow it, we should first
> require some evidence that it's needed on a supported or
> likely-to-be-supported platform.
> 
> time_t isn't guaranteed to be an integral type either if you read the
> oldest docs about it ... but no one believes that in practice ...

I think fpos_t is the non-integral one.  I thought off_t almost always
was integral.
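
To illustrate the distinction: an fpos_t can only be handed back to fsetpos and is never examined as a number, which is why it is of no use for offsets that must be written into an archive. A small sketch, with the hypothetical helper name peek_byte:

```c
#include <stdio.h>

/*
 * Hypothetical helper: look at the next byte without consuming it, using
 * the opaque fpos_t interface. The position is saved with fgetpos and
 * restored with fsetpos; its contents are never inspected.
 * Returns the byte read, or EOF on end of file or error.
 */
int
peek_byte(FILE *fp)
{
    fpos_t      saved;
    int         c;

    if (fgetpos(fp, &saved) != 0)
        return EOF;
    c = fgetc(fp);
    if (fsetpos(fp, &saved) != 0)
        return EOF;
    return c;
}
```

An off_t obtained from ftello, by contrast, can be treated as an integer and serialized, which is what the custom format needs.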




Re: [HACKERS] pg_dump and large files - is this a problem?

2002-10-03 Thread Tom Lane

Giles Lean <[EMAIL PROTECTED]> writes:
> When talking of near-current systems with 64 bit off_t you are not
> going to find one without support for 64 bit integral types.

I tend to agree with Giles on this point.  A non-integral representation
of off_t is theoretically possible but I don't believe it exists in
practice.  Before going far out of our way to allow it, we should first
require some evidence that it's needed on a supported or
likely-to-be-supported platform.

time_t isn't guaranteed to be an integral type either if you read the
oldest docs about it ... but no one believes that in practice ...

regards, tom lane




Re: [HACKERS] pg_dump and large files - is this a problem?

2002-10-03 Thread Giles Lean


Philip Warner writes:

> Yes, but there is no guarantee that off_t is implemented as such, nor would 
> we be wise to assume so (most docs say explicitly not to do so).

I suspect you're reading old documents, which is why I asked what you
were referring to.  In the '80s what you are saying would have been
best practice, no question: 64 bit type support was not common.

When talking of near-current systems with 64 bit off_t you are not
going to find one without support for 64 bit integral types.

> Again yes, but the problem is the same: we need a way of making the *value* 
> of an off_t portable (not just assuming it's an int64). In general that 
> involves knowing how to turn it into a more universal data type (eg. int64, 
> or even a string).

So you need to know the size of off_t, which will be 32 bit or 64 bit,
and then you need routines to convert that to a portable representation.
The canonical solution is XDR, but I'm not sure that you want to bother
with it or if it has been extended universally to support 64 bit types.

If you limit the file sizes to 1GB (your less preferred option, I
know;-) then like the rest of the PostgreSQL code you can safely
assume that off_t fits into 32 bits and have a choice of functions
(XDR or ntohl() etc) to deal with them and ignore 64 bit off_t
issues altogether.
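
For the 32 bit case, the ntohl() route could be sketched like this. The helper names put_offset32 and get_offset32 are hypothetical; only htonl() and ntohl() are real library calls.

```c
#include <arpa/inet.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helpers: store and load a 32-bit file offset in network
 * (big-endian) byte order, so the bytes on disk do not depend on the
 * byte order of the machine that wrote them.
 */
void
put_offset32(uint32_t offset, unsigned char buf[4])
{
    uint32_t    net = htonl(offset);

    memcpy(buf, &net, sizeof(net));
}

uint32_t
get_offset32(const unsigned char buf[4])
{
    uint32_t    net;

    memcpy(&net, buf, sizeof(net));
    return ntohl(net);
}
```

This only covers offsets that fit in 32 bits, which is the restriction described above.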

If you intend pg_dump files to be portable avoiding the use of large
files will be best.  It also avoids issues on platforms such as HP-UX
where large file support is available, but it has to be enabled on a
per-filesystem basis. :-(

> Does the large file API have functions for representing off_t values 
> in a way that is portable across architectures? And is the API also 
> portable?

The large files API is a way to access large files from 32 bit
processes.  It is reasonably portable, but is a red herring for
what you are wanting to do.  (I'm not convinced I am understanding
what you're trying to do, but I have 'flu which is not helping. :-)

Regards,

Giles





Re: [HACKERS] pg_dump and large files - is this a problem?

2002-10-03 Thread Philip Warner

At 07:15 AM 4/10/2002 +1000, Giles Lean wrote:

> > My limited reading of off_t stuff now suggests that it would be brave to
> > assume it is even a simple 64 bit number (or even 3 32 bit numbers).
>
>What are you reading??  If you find a platform with 64 bit file
>offsets that doesn't support 64 bit integral types I will not just be
>surprised but amazed.

Yes, but there is no guarantee that off_t is implemented as such, nor would 
we be wise to assume so (most docs say explicitly not to do so).


> > Unless anyone knows of a documented way to get 64 bit uint/int file
> > offsets, I don't see we have much choice.
>
>If you're on a platform that supports large files it will either have
>a straightforward 64 bit off_t or else will support the "large files
>API" that is common on Unix-like operating systems.
>
>What are you trying to do, exactly?

Again yes, but the problem is the same: we need a way of making the *value* 
of an off_t portable (not just assuming it's an int64). In general that 
involves knowing how to turn it into a more universal data type (eg. int64, 
or even a string). Does the large file API have functions for representing 
off_t values in a way that is portable across architectures? And is the API 
also portable?







Re: [HACKERS] pg_dump and large files - is this a problem?

2002-10-03 Thread Giles Lean


Philip Warner writes:

> My limited reading of off_t stuff now suggests that it would be brave to 
> assume it is even a simple 64 bit number (or even 3 32 bit numbers).

What are you reading??  If you find a platform with 64 bit file
offsets that doesn't support 64 bit integral types I will not just be
surprised but amazed.

> One alternative, which I am not terribly fond of, is to have pg_dump
> write multiple files - when we get to 1 or 2GB, we just open another
> file, and record our file positions as a (file number, file
> position) pair. Low tech, but at least we know it would work.

That does avoid the issue completely, of course, and also avoids
problems where a platform might have large file support but a
particular filesystem might or might not.

> Unless anyone knows of a documented way to get 64 bit uint/int file 
> offsets, I don't see we have much choice.

If you're on a platform that supports large files it will either have
a straightforward 64 bit off_t or else will support the "large files
API" that is common on Unix-like operating systems.

What are you trying to do, exactly?

Regards,

Giles







Re: [HACKERS] pg_dump and large files - is this a problem?

2002-10-03 Thread Mario Weilguni

>My limited reading of off_t stuff now suggests that it would be brave to 
>assume it is even a simple 64 bit number (or even 3 32 bit numbers). One 
>alternative, which I am not terribly fond of, is to have pg_dump write 
>multiple files - when we get to 1 or 2GB, we just open another file, and 
>record our file positions as a (file number, file position) pair. Low tech, 
>but at least we know it would work.
>
>Unless anyone knows of a documented way to get 64 bit uint/int file 
>offsets, I don't see we have much choice.

How common is fgetpos64? Linux supports it, but I don't know about other
systems.

http://hpc.uky.edu/cgi-bin/man.cgi?section=all&topic=fgetpos64

Regards,
Mario Weilguni




Re: [HACKERS] pg_dump and large files - is this a problem?

2002-10-03 Thread Philip Warner

At 11:06 AM 2/10/2002 -0400, Tom Lane wrote:
>It needs to get done; AFAIK no one has stepped up to do it.  Do you want
>to?

My limited reading of off_t stuff now suggests that it would be brave to 
assume it is even a simple 64 bit number (or even 3 32 bit numbers). One 
alternative, which I am not terribly fond of, is to have pg_dump write 
multiple files - when we get to 1 or 2GB, we just open another file, and 
record our file positions as a (file number, file position) pair. Low tech, 
but at least we know it would work.
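
The (file number, file position) bookkeeping for the multi-file alternative could be sketched as below. The SegmentPos type, the 1GB SEGMENT_SIZE limit and the segment_advance function are all hypothetical names chosen for illustration; none of them exist in pg_dump.

```c
/* Hypothetical position record for an archive split across several files. */
typedef struct
{
    unsigned int  file_number;  /* which segment file, counting from 0 */
    unsigned long position;     /* byte position within that segment */
} SegmentPos;

/* Assumed per-file limit of 1GB; fits in a 32-bit unsigned long. */
#define SEGMENT_SIZE (1024UL * 1024UL * 1024UL)

/*
 * Advance pos by nbytes, moving on to the next segment file whenever the
 * current one is full. No arithmetic wider than unsigned long is needed.
 */
void
segment_advance(SegmentPos *pos, unsigned long nbytes)
{
    unsigned long room;

    while (nbytes > 0)
    {
        room = SEGMENT_SIZE - pos->position;
        if (nbytes < room)
        {
            pos->position += nbytes;
            nbytes = 0;
        }
        else
        {
            /* Fill the current segment and start the next one. */
            pos->file_number++;
            pos->position = 0;
            nbytes -= room;
        }
    }
}
```

Both fields fit in 32 bits, so the pair can be written with whatever integer output routine the archive format already has.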

Unless anyone knows of a documented way to get 64 bit uint/int file 
offsets, I don't see we have much choice.






Re: [HACKERS] pg_dump and large files - is this a problem?

2002-10-02 Thread Philip Warner

At 11:06 AM 2/10/2002 -0400, Tom Lane wrote:
>It needs to get done; AFAIK no one has stepped up to do it.  Do you want
>to?

I'll have a look; my main concern at the moment is that off_t and size_t 
are totally non-committal as to structure; in particular I can probably 
safely assume that they are unsigned, but can I assume that they have the 
same endianness as int etc?

If so, then will it be valid to just read/write each byte in endian order? 
How likely is it that the 64 bit value will actually be implemented as a 
structure like:

typedef struct { int lo; int hi; } off_t;

which effectively ignores endian-ness at the 32 bit scale?
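
For the integral case, the host's byte order can be probed by looking at the first byte of a known value. This is only a sketch with a hypothetical function name, and it says nothing about a struct layout like the one above, where the order of the two halves is a separate matter.

```c
#include <string.h>

/*
 * Hypothetical probe: returns 1 if the least significant byte of an
 * unsigned int is stored first in memory (little-endian), 0 otherwise.
 */
int
host_is_little_endian(void)
{
    unsigned int  probe = 1;
    unsigned char first;

    memcpy(&first, &probe, 1);
    return first == 1;
}
```

Code that writes bytes in a fixed order using shifts does not need this probe at all, since it never depends on how the value is laid out in memory.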







Re: [HACKERS] pg_dump and large files - is this a problem?

2002-10-02 Thread Bruce Momjian

Philip Warner wrote:
> At 09:42 AM 2/10/2002 +1000, Philip Warner wrote:
> >Yes, and do the peripheral stuff to support old archives etc.
> 
> Does silence mean people agree? Does it also mean someone is doing this 
> (eg. whoever did the off_t support)? Or does it mean somebody else needs to 
> do it?

Added to open items:

Fix pg_dump to handle 64-bit off_t offsets for custom format




   P O S T G R E S Q L

  7 . 3  O P E N    I T E M S


Current at ftp://momjian.postgresql.org/pub/postgresql/open_items.

Source Code Changes
-------------------
Schema handling - ready? interfaces? client apps?
Drop column handling - ready for all clients, apps?
Fix BeOS, QNX4 ports
Fix AIX large file compile failure of 2002-09-11 (Andreas)
Get bison upgrade on postgresql.org for ecpg only (Marc)
Fix vacuum btree bug (Tom)
Fix client apps for autocommit = off
Change log_min_error_statement to be off by default (Gavin)
Fix return tuple counts/oid/tag for rules, SPI
Add schema dump option to pg_dump
Make SET not start a transaction with autocommit off, document it
Remove GRANT EXECUTE to all /contrib functions?
Change NUMERIC to have 16 digit precision
Handle CREATE CONSTRAINT TRIGGER without FROM in loads from old db's
Fix pg_dump to handle 64-bit off_t offsets for custom format

On Going
--------
Security audit


Documentation Changes
---------------------
Document need to add permissions to loaded functions and languages
Move documentation to gborg for moved projects


7.2.X
-----
CLOG
WAL checkpoint
Linux mktime()






Re: [HACKERS] pg_dump and large files - is this a problem?

2002-10-02 Thread Tom Lane

Philip Warner <[EMAIL PROTECTED]> writes:
> At 09:42 AM 2/10/2002 +1000, Philip Warner wrote:
>> Yes, and do the peripheral stuff to support old archives etc.

> Does silence mean people agree? Does it also mean someone is doing this 
> (eg. whoever did the off_t support)? Or does it mean somebody else needs to 
> do it?

It needs to get done; AFAIK no one has stepped up to do it.  Do you want
to?

regards, tom lane




Re: [HACKERS] pg_dump and large files - is this a problem?

2002-10-02 Thread Philip Warner

At 09:42 AM 2/10/2002 +1000, Philip Warner wrote:
>Yes, and do the peripheral stuff to support old archives etc.

Does silence mean people agree? Does it also mean someone is doing this 
(eg. whoever did the off_t support)? Or does it mean somebody else needs to 
do it?







