Re: [HACKERS] Summary of some postgres portability issues

2008-07-14 Thread Bruce Momjian
Ken Camann wrote:
 In trying to port postgres to 64-bit Windows, I've encountered a
 number of issues which may (or may not) affect other compilers.  If
 you followed the other thread, this is mostly a summary with a bit
 more details so feel free to ignore it.  Some of these may have some
 minor effects on other platforms, so they may be of interest (but I
 doubt it, since no one has noticed/complained about them yet).  This
 post contains a small taxonomy of the problems, as well as some
 discussion about the work that needs to be done in order to make
 postgres portable to LLP64 data model compilers (in case someone else
 is interested).  I use the standard terms for discussing different
 compiler data models, which are explained here:
 
 http://www.unix.org/version2/whatsnew/lp64_wp.html
 
 Throughout this post I will assume sizeof(size_t) == sizeof(void*),
 because I doubt you want to support a system where this is not the
 case.
 
 When I try to compile postgres, I get 396 warnings.  These come from
 several different places:
 
 1.) Most of the code involving strings requires a ILP32 or ILP64 to
 not generate a warning.  This means sizeof(int) == sizeof(size_t) ==
 32 or 64, respectively.  Something as simple as:
 
 int len;
 
 len = strlen(str);
 
...
 
 int len;
 
 len = (int)strlen(str);

If we know we are addressing a datum that has to be 4GB, it would be
best to use some typedef that is descriptive, like:

  len = (DatumLen)strlen(str);

-- 
  Bruce Momjian  [EMAIL PROTECTED]http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Summary of some postgres portability issues

2008-07-14 Thread Tom Lane
Bruce Momjian [EMAIL PROTECTED] writes:
 Ken Camann wrote:
 When I try to compile postgres, I get 396 warnings.  These come from
 several different places:
 
 1.) Most of the code involving strings requires a ILP32 or ILP64 to
 not generate a warning.  This means sizeof(int) == sizeof(size_t) ==
 32 or 64, respectively.  Something as simple as:

That reminds me, I was going to post another comment on this: maybe the
most effective answer for the OP is to suppress the specific warning
about casting size_t to int.  (Maybe his compiler can't do that, but
most commercial compilers I've seen have some such ability.)  Because of
the way that Postgres is designed, and the quite a few years of testing
it has had on 64-bit boxes, it's highly unlikely that such coercions are
actually problems; thus, there is not a whole lot of interest in
invasive and perhaps performance-losing patches to get rid of 'em.

The warnings you *do* want to see are the equivalents of gcc's cast
between pointer and integer of different size --- if that's separable
from warnings about casts between integer sizes then you're in good
shape.

At the very least, I'd suggest that the correct development path is
to fix the pointer-conversion warnings first, so that you can get a
working port; only then worry about cosmetics.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Summary of some postgres portability issues

2008-07-09 Thread Martijn van Oosterhout
On Tue, Jul 08, 2008 at 07:01:10PM -0400, Ken Camann wrote:
 In trying to port postgres to 64-bit Windows, I've encountered a
 number of issues which may (or may not) affect other compilers.  If
 you followed the other thread, this is mostly a summary with a bit
 more details so feel free to ignore it.  Some of these may have some
 minor effects on other platforms, so they may be of interest (but I
 doubt it, since no one has noticed/complained about them yet). 

Just clarifying for myself: you are mostly listing theoretical problems
here, not actual I ran it and got regression failures problems, right?

You spend some time arguing that long is the wrong type for lengths in
memory but since all Datums in postgres are limited to 1GB I don't
understand how this can be a practical problem since that can be stored
in an int and a long on any platform.

Mostly you seem to be noting that whatever compiler you are using is
much stricter than the other compilers used in the buildfarm. Clearly
neither icc nor sun studio find these problems on other 64-bit
platforms.

 Does that mean that
 almost every part of postgres that interacts with memory or the Datum
 type must be read very carefully to rediscover all the assumptions
 behind the code?  Unfortunately I would guess yes.

I don't understand what you mean here: the Datum type has very clear
rules about how it is stored. It is essentially opaque, but given the
typlen you have enough information to know how to copy it for example.

Have a nice day,
-- 
Martijn van Oosterhout   [EMAIL PROTECTED]   http://svana.org/kleptog/
 Please line up in a tree and maintain the heap invariant while 
 boarding. Thank you for flying nlogn airlines.


signature.asc
Description: Digital signature


Re: [HACKERS] Summary of some postgres portability issues

2008-07-09 Thread Peter Eisentraut
Am Mittwoch, 9. Juli 2008 schrieb Ken Camann:
 In trying to port postgres to 64-bit Windows, I've encountered a
 number of issues which may (or may not) affect other compilers.

Given that PostgreSQL runs fine on other 64-bit architectures, unsubstantiated 
claims about other compilers being affected are probably just going to 
distract you from the work that needs to be done.  I suggest you concentrate 
on fixing the problem at hand, namely you take the actual compiler warnings 
and errors and develop fixes for them.  Perhaps a generalization will arise 
here or there, but so far you are the only person who has actually seen the 
problems, so all the rest of us are just idly guessing.

That said, if most of the problems can be fixed by replacing int by size_t, 
please go for it.  I think in most cases the use of int was just laziness 
rather than careful planning.

Also, please consider setting up a buildfarm client for this port.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Summary of some postgres portability issues

2008-07-09 Thread Ken Camann
On Wed, Jul 9, 2008 at 3:35 AM, Martijn van Oosterhout
[EMAIL PROTECTED] wrote:

 Just clarifying for myself: you are mostly listing theoretical problems
 here, not actual I ran it and got regression failures problems, right?

Correct.  This is why most of them point out that they are not
actually real problems, just written in a less than ideal way.  Like
all warnings, they might be a problem, and they might not.  Some of
them almost certainly would cause problems, like any time it is
assumed that a long it large enough to hold a memory address.   Many
of them are completely innocent, and it is clear that they are (this
is true with most of them, because they occur in string operations).
Unfortunately they generate the same warning whether they're bad or
not, and its quite hard to look through the compiler output and make
sense of it.  I can't really make a mental note that I already
checked that one and it seemed find because there are 400 of them in
just the main postgres daemon alone.  But its hard to try to find the
bad ones with so many of them (most not problems) on the screen at
once.  I'd fix them, but would anyone be willing to commit that?  I
basically have to fix them anyway, just so I don't have to look at
them.  I can't disable the warning, because I need to see it in the
instances where it is important.  I included them in the summary not
because they are all problems which cause failure, but because I felt
they belonged in a summary of all data-model-related portability
issues.  Note that instances where it is and is not important is
decidable only by a human.  No one is going to input a user name
bigger than 2^32 characters.  You may not truncate a pointer.  These
are the exact same warning (truncating an integer).

 You spend some time arguing that long is the wrong type for lengths in
 memory but since all Datums in postgres are limited to 1GB I don't
 understand how this can be a practical problem since that can be stored
 in an int and a long on any platform.

No, I am not trying to argue this (or at least not from a design
standpoint).  Here is something that probably should have been
obvious, but perhaps is not: I do not know very much about postgres,
or how it works.  I also don't know very much about database
theory/algorithms/programming either.  I am also not interested in
adding new functionality to the system.  I'm just a developer who
wants to use a UDF in a 64-bit DLL with a free database, for a totally
different project that I am working on.  MySQL and Firebird are the
only two databases in which you can currently do this.  I like
postgres a lot more than MySQL and Firebird, plus MySQL crashes on my
very large datasets and Firebird has numerous features missing that I
would like.  The easiest thing (although that is becoming less true)
would be get to postgres to compile as a native 64 bit application.  I
honestly thought it would be a lot easier than I think it will be now.
 If this continues to be such a problem, I will need to move back to
MySQL and hope they fix the bugs.

I know that a Datum cannot be bigger than 1 GB either way, but the
documentation around the Datum typedef notes that Datum must large
enough to hold a pointer.  It does not say why, or where this
assumption gets used, or why it was made.  It's simply a warning that
somewhere, in the very large codebase, this may happen.  Where does it
happen?  Does it ever happen?  I really, really wish I knew or that
someone could just tell me.  Stuff like that is basically the reason
that I post to this mailing list.  My assumption is though, that this
code is old and that no one really remembers all the semantics
surrounding all the uses of it.  Are the regression tests so strong
that I can not even worry about it?  That just seems like a very bad
idea, especially since the documentation warns me that this _will_
happen.

As for what would replace it, I think intptr_t.  This type has the
same size as long on LP32, ILP32, LP64, and ILP64 so there would be no
changes to anything that already works, plus this type can hold a
pointer on LLP64 compilers.  This is a change I have already made.
Then I found several references to the assumption that Datum has long
alignment, from code that produce no warnings.  This would almost
certainly create problems.  It also means I would have to look at
every use of Datum to find stuff like this.  I must also understand it
in full, even when it has little or no documentation.  The story is
similar with long, where it is assumed it can only hold memory
addresses, has pointer alignment, in other places.  As I read this
stuff, I become more and more annoyed with Microsoft's decision to
make an LLP64 compiler in the first place.  To not have the larger of
the two basic integer types be able to hold a pointer is just
ridiculous.  However, despite the embrace of gcc and Linux by the
culture, I believe by the numbers MSVC is the most common C/C++
compiler and by a significant margin.  I 

Re: [HACKERS] Summary of some postgres portability issues

2008-07-09 Thread Martijn van Oosterhout
On Wed, Jul 09, 2008 at 03:03:48PM -0400, Ken Camann wrote:
 On Wed, Jul 9, 2008 at 3:35 AM, Martijn van Oosterhout
 [EMAIL PROTECTED] wrote:
 
  Just clarifying for myself: you are mostly listing theoretical problems
  here, not actual I ran it and got regression failures problems, right?
 
 Correct.  This is why most of them point out that they are not
 actually real problems, just written in a less than ideal way.  

 I'd fix them, but would anyone be willing to commit that?

Probably, as long as the proposed fix don't break any other already
supported platform.

 I know that a Datum cannot be bigger than 1 GB either way, but the
 documentation around the Datum typedef notes that Datum must large
 enough to hold a pointer.  It does not say why, or where this
 assumption gets used, or why it was made.

Firstly, I'd suggest checking out the docs on CREATE TYPE, to see the
possibilities:
http://www.postgresql.org/docs/8.2/static/sql-createtype.html

Basically, a Datum can represent any type you can see from SQL. It must
be big enough to hold a pointer because in most cases, it *is* a
pointer. There's essentially three possibilities:

- Small integral type: eg: float/int4/int2/bool, these are simply
  stored as the Datum itself.
- Fixed width type: eg: point/complex, here the Datum is a pointer to
  the data
- Variable length type: text/varchar, here the Datum is a pointer to a
  header which describes the actual length.

The actual assumptions are listed somewhere in the header defining
Datum.

 As for what would replace it, I think intptr_t.  This type has the
 same size as long on LP32, ILP32, LP64, and ILP64 so there would be no
 changes to anything that already works, plus this type can hold a
 pointer on LLP64 compilers.

Does this type exist on all the other systems? Not everywhere is C99
unfortunatly.

  I don't understand what you mean here: the Datum type has very clear
  rules about how it is stored. It is essentially opaque, but given the
  typlen you have enough information to know how to copy it for example.
 
 Well, that is some good news.  Where can I find these rules?

I think I explained it enough above, if you have any more questions,
just ask.

Have a nice day,
-- 
Martijn van Oosterhout   [EMAIL PROTECTED]   http://svana.org/kleptog/
 Please line up in a tree and maintain the heap invariant while 
 boarding. Thank you for flying nlogn airlines.


signature.asc
Description: Digital signature


[HACKERS] Summary of some postgres portability issues

2008-07-08 Thread Ken Camann
In trying to port postgres to 64-bit Windows, I've encountered a
number of issues which may (or may not) affect other compilers.  If
you followed the other thread, this is mostly a summary with a bit
more details so feel free to ignore it.  Some of these may have some
minor effects on other platforms, so they may be of interest (but I
doubt it, since no one has noticed/complained about them yet).  This
post contains a small taxonomy of the problems, as well as some
discussion about the work that needs to be done in order to make
postgres portable to LLP64 data model compilers (in case someone else
is interested).  I use the standard terms for discussing different
compiler data models, which are explained here:

http://www.unix.org/version2/whatsnew/lp64_wp.html

Throughout this post I will assume sizeof(size_t) == sizeof(void*),
because I doubt you want to support a system where this is not the
case.

When I try to compile postgres, I get 396 warnings.  These come from
several different places:

1.) Most of the code involving strings requires a ILP32 or ILP64 to
not generate a warning.  This means sizeof(int) == sizeof(size_t) ==
32 or 64, respectively.  Something as simple as:

int len;

len = strlen(str);

violates this on LP32, LP64, and LLP64.  AFAIK, there really are no
LP32 compilers around anymore, but LP64 is common (LLP64 is MSVC).

None of these warnings are actually problems, since they involve
strings and realistically the problems which could happen never
will.  Unfortunately, these are actually portability problems, since
you never want to disable narrow cast warnings when supporting
different architectures because some of the warnings could be
important.  If these aren't disabled, they will be very annoying and
make it hard to spot real problems (and tempt people to turn off all
such warnings).  If they are changed, almost 300 lines will need to be
committed, all of which have the not very exciting form:

int len;

len = (int)strlen(str);

the alternative is changing int to size_t everywhere, which several
have objected to because of bloat.  This bloat will only affect LP64
and LLP64, which do not seem to have been the target machines in the
first place.  I'd be willing to make the changes to either form, but I
don't know if anyone would be willing to commit them :)

2.) There is a lot of other code involving memory index and offset
calculations being int.  On ILP64, these will be able to work with
buffers  2 GB.  On LP64 or LLP64, they will not.  On ILP64,
sizeof(int) == sizeof(size_t), but on the other two sizeof(int) 
sizeof(size_t).  Either c.h or postgres.h (I forgot which) defines an
Offset and Index typedef to aid in portability, but they are only
rarely used.  Most of the unchecked conversions from size_t to int are
of the string variety (1), but there are a fair amount of these as
well.

None of these warnings are actually problems even on LP64 or LLP64,
unless the buffers involved are  2 GB.  Buffers  2 GB will work with
no changes on ILP64 only.  Whether the problem domain specifies that
they _can't_ (or probably never should) be  2 GB either way must be
examined on a case by case basis, and I haven't examined that yet.

Thoughts on 1  2
==

I was surprised to see this in the code.  The reason is that both of
these issues affect LP64.  Problems with LLP64 are expected, because
LLP64 basically means Microsoft and therefore support is not usually
a concern of the OSS community.  LP64 on the other hand is any x64
machine using gcc, or at least it was several years ago.  Has that
changed?  Can gcc now be configured to use ILP64 instead?

3.) Some of the code assigns size_t to uint16.  These should elicit
warnings on all compilers, but are almost certainly guaranteed to
never be errors because the use of uint16 implies that the developer
clearly knows that this is the maximum size needed in all situations
and the precision loss never matters (here, the size_t variables being
cast were probably not really supposed to be size_t in the first
place, but Size was used in the RHS with no cast, carelessly).

4.) Some of the code assigns size_t to uint32.  It's unclear if these
cases are like (2) or like (3), and would need to be examined on a
case by case basis.

Problems for LLP64 compilers
==

Almost everywhere the keyword long appears, is a problem for LLP64
whether its a warning or not.  Unfortunately this happens in a very
large number of places.  Size is supposed to be used for memory
resident objects, but rarely is.  Often (signed) long is used, and its
not immediately clear without fully reading through the program
whether it actually needs to be signed or not (i.e., whether it can
safely be changed to Size or not).  Sometimes it does need to be
signed, as allocations can apparently be negative, and this is
required for correct program operation i.e., there are a few:

while !(something_which_should_semantically_never_be_less_than_0  0)

The types of