Re: [HACKERS] Build date for snapshots?

2006-09-07 Thread Jan de Visser
On Thursday 07 September 2006 01:24, Marc G. Fournier wrote:

 How about adding a simple: 'touch snapshotdate' .. in the root directory?
 Would that suffice?

I'd say date +[something] > snapshotdate

I betcha that people that can't remember trivial things will muck up file 
timestamps by copying :)

jan

-- 
--
Jan de Visser                     [EMAIL PROTECTED]

                Baruk Khazad! Khazad ai-menu!
--



Re: [HACKERS] FE/BE protocol vs. parameterized queries

2006-09-07 Thread Michael Paesold

Tom Lane wrote:

The infrastructure for the former planning method (using the first
Bind's parameters as sample values for estimation, but not as constants)
is still there, but it's not being used now.  Does anyone want to argue
for changing things to plan named statements that way?  I'm of two minds
about it myself; you can make a good case that it'd usually be a win,
but it's also not hard to envision scenarios where it'd be a loss.


Although I don't have a clear opinion myself, I sometimes read on this list 
that people are using prepared statements to get safe, stable plans, i.e. 
plans that don't depend on the specific parameter input.


If you change that, I don't think they will be happy at all. I suggest 
leaving it as-is for 8.2. I think the user (i.e. the driver) should be able to 
tell the backend whether it wants planning at the first Bind or right at 
Prepare.


Best Regards
Michael Paesold



Re: [HACKERS] FE/BE protocol vs. parameterized queries

2006-09-07 Thread Csaba Nagy
 Although I don't have a clear opinion myself, I sometimes read on this list 
 that people are using prepared statements to get safe, stable plans, i.e. 
 plans that don't depend on the specific parameter input.

I definitely want the possibility of getting stable plans. That's only
possible if the planner does NOT take into account any parameter values.
If the statistics go out of date faster than it's practical to run
analyze, but the plans stay stable, it's better not to take the
parameter values into account.
 
 If you change that, I don't think they will be happy at all. I suggest 
 leaving it as-is for 8.2. I think the user (i.e. driver) should be able to 
 tell the backend, if they want planning for the first bind, or right at 
 prepare.

That would be nice. We would probably use all 3 forms (see the libpq sketch
below):
  - unnamed statement: prepare based on constant parameters;
  - named statement: prepare based on the first set of parameter values;
  - named statement: prepare a generic plan without considering parameter
    values.
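
For illustration, a minimal libpq sketch of the unnamed-versus-named
distinction (the connection string, table t and query are made up; only the
API calls matter). With the behaviour discussed above, the unnamed statement
is planned at Bind time with the actual parameter value in hand, while the
named statement is planned at Prepare time without it:

/* Sketch only: error handling trimmed, table "t" is hypothetical. */
#include <stdio.h>
#include <libpq-fe.h>

int main(void)
{
    PGconn *conn = PQconnectdb("dbname=test");
    const char *const params[1] = { "42" };
    PGresult *res;

    if (PQstatus(conn) != CONNECTION_OK)
    {
        fprintf(stderr, "connection failed: %s", PQerrorMessage(conn));
        return 1;
    }

    /* Unnamed statement: parse + bind + execute in one round trip,
     * so the planner can see the parameter value. */
    res = PQexecParams(conn, "SELECT * FROM t WHERE id = $1",
                       1, NULL, params, NULL, NULL, 0);
    PQclear(res);

    /* Named statement: planned when PQprepare is processed, before any
     * parameter values have been bound. */
    res = PQprepare(conn, "stmt1", "SELECT * FROM t WHERE id = $1", 1, NULL);
    PQclear(res);
    res = PQexecPrepared(conn, "stmt1", 1, params, NULL, NULL, 0);
    PQclear(res);

    PQfinish(conn);
    return 0;
}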

Cheers,
Csaba.





Re: UUID/GUID discussion leading to request for hexstring bytea? (was: Re: [HACKERS] TODO: GUID datatype)

2006-09-07 Thread Martijn van Oosterhout
On Wed, Sep 06, 2006 at 05:05:47PM -0400, [EMAIL PROTECTED] wrote:
 2) Create semi-generic types with common bitlengths. Associated
functions work on these semi-generic types. No overhead.
- hexstring128, hexstring160, ...
 
 3) Create a new bytea type that has ascii input and output formats,
probably based around hexstrings. Overhead of 4 bytes.

I think 3) is worthwhile for core; it would have many uses. But you
don't actually need a new type for that, just new I/O
functions.

As for 2), I think it would be acceptable for contrib to contain some code
that demonstrates how to make fixed-length types. It would be fairly
straightforward to make a script where you give it a type name and a
length and it'll spit out the code for that type.

I don't think UUID-specific stuff needs to be in core, though you could
make an argument that the hex input/output functions should ignore
dashes, to make it straightforward to store UUIDs directly in there.
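
To make that concrete, here is a rough, untested sketch of what such generated
code for a 16-byte "hexstring128" type might look like (the names are invented
here; the matching CREATE TYPE would declare INTERNALLENGTH = 16 and point at
these input/output functions):

/* Hypothetical generated code for a fixed-length, 16-byte "hexstring128"
 * type: 32 hex digits in, 32 hex digits out, no varlena header. */
#include "postgres.h"
#include "fmgr.h"

PG_MODULE_MAGIC;

#define HEXSTRING128_LEN 16

static int
hex_digit(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

PG_FUNCTION_INFO_V1(hexstring128_in);
Datum
hexstring128_in(PG_FUNCTION_ARGS)
{
    char          *str = PG_GETARG_CSTRING(0);
    unsigned char *result;
    int            i;

    if (strlen(str) != 2 * HEXSTRING128_LEN)
        ereport(ERROR,
                (errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
                 errmsg("hexstring128 input must be exactly %d hex digits",
                        2 * HEXSTRING128_LEN)));

    result = (unsigned char *) palloc(HEXSTRING128_LEN);
    for (i = 0; i < HEXSTRING128_LEN; i++)
    {
        int hi = hex_digit(str[2 * i]);
        int lo = hex_digit(str[2 * i + 1]);

        if (hi < 0 || lo < 0)
            ereport(ERROR,
                    (errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
                     errmsg("invalid hexstring128 input: \"%s\"", str)));
        result[i] = (hi << 4) | lo;
    }
    PG_RETURN_POINTER(result);
}

PG_FUNCTION_INFO_V1(hexstring128_out);
Datum
hexstring128_out(PG_FUNCTION_ARGS)
{
    unsigned char *value = (unsigned char *) PG_GETARG_POINTER(0);
    char          *result = palloc(2 * HEXSTRING128_LEN + 1);
    int            i;

    for (i = 0; i < HEXSTRING128_LEN; i++)
        sprintf(result + 2 * i, "%02x", value[i]);
    PG_RETURN_CSTRING(result);
}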

Hope this helps,
-- 
Martijn van Oosterhout   kleptog@svana.org   http://svana.org/kleptog/
 From each according to his ability. To each according to his ability to 
 litigate.




Re: [HACKERS] Win32 hard crash problem

2006-09-07 Thread Gregory Stark

Joshua D. Drake [EMAIL PROTECTED] writes:

 Yes I am fully aware of that. I am only relaying what the customer said.

Yeah sorry, I guess what I sent was pretty obvious to you. I should stop
confusing -general with -hackers :)

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com




[HACKERS] Template0 age is increasing speedily.

2006-09-07 Thread Nimesh Satam
Hi!..

I noticed that the age of template0 is increasing very rapidly. Can you please let me know how we can control this and what causes such problems?

We also noticed that the database slows down heavily at a particular time. Can you suggest any tools which will help in diagnosing the root cause behind the load?



Regards,
Nimesh.


Re: [HACKERS] Fixed length data types issue

2006-09-07 Thread Gregory Stark
Martijn van Oosterhout kleptog@svana.org writes:

 So you end up storing the typmod in the Datum itself, which brings you
 right back to varlena.

Not really since the Datum doesn't actually end up on disk in the case of
pass-by-reference.

which leads us to:

 Well, the root of the problem depends on your perspective. If the
 purpose behind all of this is to save disk space, perhaps the root of
 the problem is that disk representation and memory representation are
 intimately tied?

Indeed.

Consider this real table definition I found in a few moments searching for
schemas on google:

PRVDR_CTRL_TYPE_CD: CHAR(2)
PRVDR_NUM: CHAR(6)
NPI: NUMBER
RPT_STUS_CD: CHAR(1)
FY_BGN_DT: DATE
FY_END_DT: DATE
PROC_DT: DATE
INITL_RPT_SW: CHAR(1)
LAST_RPT_SW: CHAR(1)
TRNSMTL_NUM: CHAR(2)
FI_NUM: CHAR(5)
ADR_VNDR_CD: CHAR(1)
FI_CREAT_DT: DATE
UTIL_CD: CHAR(1)
NPR_DT: DATE
SPEC_IND: CHAR(1)
FI_RCPT_DT: DATE

By my count postgres would use 154 bytes for this record. Whereas in fact
there's no need for it to take more than 87 bytes. Almost 100% overhead for
varattlen headers and the padding they necessitate.

This is not a pathological example. This is a very common style of database
schema definition. Many many database tables in the real world are a 1-1
translations of existing flat file databases which have lots of short fixed
length ascii codes. Any database interacting with any old school inventory
management systems, financial databases, marketing database, etc is likely to
be of this form.

So it seems what has to happen here is we need a way of defining a data type
that has a different on-disk representation from its in-memory definition.
That means a lot more cpu overhead since I imagine it will mean pallocing the
in-memory representation before you can actually do anything with the data.

The disk reader and writer functions could probably use the typmod, but it
seems what they really want access to is the attlen, because what they
really want to know is the length of the object their pointer refers to.
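
Neither of the hooks below exists today; as a freestanding illustration
(plain C, nothing PostgreSQL-specific), a reader/writer pair for a
CHAR(n)-style column might take a shape like this, with the column's attlen
supplied from the catalog rather than from a per-value header:

/* Hypothetical illustration only: pack/unpack a fixed-width column so the
 * on-disk form carries no length header, while the in-memory form stays a
 * convenient NUL-terminated string. */
#include <stdlib.h>
#include <string.h>

/* "Disk writer": exactly attlen bytes, blank-padded, no header. */
void
disk_write_char(const char *value, char *diskbuf, int attlen)
{
    size_t len = strlen(value);

    if (len > (size_t) attlen)
        len = (size_t) attlen;          /* truncate, as CHAR(n) would */
    memcpy(diskbuf, value, len);
    memset(diskbuf + len, ' ', attlen - len);
}

/* "Disk reader": expand the attlen on-disk bytes into a freshly allocated
 * in-memory string the rest of the code can treat normally. */
char *
disk_read_char(const char *diskbuf, int attlen)
{
    char *result = malloc(attlen + 1);

    memcpy(result, diskbuf, attlen);
    result[attlen] = '\0';
    return result;
}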

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com



Re: [HACKERS] Fixed length data types issue

2006-09-07 Thread Gregory Stark
Gregory Stark [EMAIL PROTECTED] writes:

 Martijn van Oosterhout kleptog@svana.org writes:

 So you end up storing the typmod in the Datum itself, which brings you
 right back to varlena.

 Not really since the Datum doesn't actually end up on disk in the case of
 pass-by-reference.

Just brainstorming here, but what happens if we make Datum 2*sizeof(pointer)
and store the typmod and/or attlen in it?

Obviously it means the memory use goes up dramatically. But the disk i/o could
potentially be reduced dramatically as well.

Does it let us do anything else we've been dreaming of but not thought doable?
Does it cause any fundamental problems?

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com



Re: [HACKERS] Fixed length data types issue

2006-09-07 Thread Martijn van Oosterhout
On Thu, Sep 07, 2006 at 11:57:26AM +0100, Gregory Stark wrote:
 Just brain storming here. But what happens if we make Datum 2*sizeof(pointer)
 and stored the typmod and/or attlen in it?

The fundamental property of a Datum is that you can pass it by value to
a C function. This generally means it has to fit in a register. On the
whole, the CPU register size is the same as the pointer size, so
2*sizeof(pointer) is unlikely to fit...

Have a nice day,
-- 
Martijn van Oosterhout   kleptog@svana.org   http://svana.org/kleptog/
 From each according to his ability. To each according to his ability to 
 litigate.




Re: [HACKERS] UUID/GUID discussion leading to request for hexstring bytea?

2006-09-07 Thread Gevik Babakhani
On Wed, 2006-09-06 at 17:05 -0400, [EMAIL PROTECTED] wrote:

 The UUID type itself has value; however, the value it provides is
 limited. Generation of a UUID doesn't have to occur within the database:
 the application inserting the row can generate the UUID. The type itself
 has limited value in that the difference between a fully expanded
 char(40) at 36 + 4 = 40 bytes, the packed value using the hexstring
 encode/decode SQL functions on a bytea at 16 + 4 = 20 bytes, and a
 tightly packed UUID type at 16 bytes is not that large. The argument can
 easily be made that if space (disk space, index size, I/O bandwidth) is
 your first priority, then a UUID is the wrong type to use. A 64-bit
 integer may suffice.

Agreed.

 
 I'm also having trouble with the idea that a UUID deserves special
 treatment. I currently have a desire to store both UUID and MD5
 checksum in my rows. They are both 128 bits = 16 bytes, and fit all
 of the same requirements above.

Agreed.

 This makes it seem as if a generic 128-bit data type would be
 desirable. They both have a compatible representation as a hexadecimal
 string. The extra '-' characters in the UUID can easily be added when
 necessary by a HEXSTRING2UUID() sort of function in plpgsql or in
 the calling application.

Agreed

 
 But - MD5 isn't the only checksum that is frequently used. Some argue
 that the MD5 can be shown to be weak in some regards, and that perhaps
 other checksum algorithms such as SHA-1 provide a better guarantee of
 uniqueness. SHA-1 isn't 128-bits. It's 160-bits.

To my knowledge most apps use MD5 checksum because it is easier to
implement and use.  
 
 This is where I start to buy Tom Lane's argument that the 4-byte
 prefix is no big deal. I find it more desirable to have a binary data
 type with a hexadecimal string input and output function. The
 flexibility of being able to use 128-bits or 160-bits is worth this
 4-byte header to me. What I don't want to do is store double size
 fields, stored as hexadecimal.

Agreed.

I have experimented with GUID/UUID type for a while and here are my
findings when having the need to store GUID/UUID/MD5 in the database.

1. Almost always these values are created outside the database, so in my
opinion having all kinds of functions in the database to create such
values is a second priority, except when you want auto-generated GUIDs for
primary keys, like the uuid datatype in MS SQL.

2. So far I haven't seen any kind of arithmetic operations (+ - * / %) on
stored GUIDs; such operations would be meaningless. I have also never seen
the need to change specific parts of a GUID.

In my opinion only some of the relational/comparison operators, like ==
and !=, apply to such values; ordering comparisons such as guid > guid or
md5 > md5 are also meaningless.

3. Almost always a GUID/MD5 is not changed once it is generated and
stored, except for cases like resetting the md5sum of a password that has
changed.

In my opinion a GUID type needs to provide the following in the database:

1. GUID type must accept the correct string format(s), with or without
the extra '-'
2. GUID type must internally be stored as small as possible.
3. GUID type must be comparable with ==, !=, LIKE and IS (NOT) NULL
4. GUID type must have the ability to be indexed, grouped, ordered,
DISTINCT... but not MAX(), MIN() or SUM()

I think option three would be a good one to experiment with. I am
thinking of a 16-byte struct to hold the GUID value and the corresponding
functionality to achieve the above as far as possible.

I also think we should have GUID/UUID as a datatype and not just
functions handling hexstrings.
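
A rough sketch of that 16-byte struct and the equality/ordering support that
indexing, GROUP BY and DISTINCT would need (illustration only, not a
worked-out patch; the function and type names are invented):

/* Sketch of a fixed-length, pass-by-reference GUID type.  memcmp order is
 * arbitrary but consistent, which is all that btree indexing, GROUP BY and
 * DISTINCT require. */
#include "postgres.h"
#include "fmgr.h"

PG_MODULE_MAGIC;

typedef struct pg_guid
{
    unsigned char data[16];
} pg_guid;

static int
guid_internal_cmp(const pg_guid *a, const pg_guid *b)
{
    return memcmp(a->data, b->data, sizeof(a->data));
}

PG_FUNCTION_INFO_V1(guid_eq);
Datum
guid_eq(PG_FUNCTION_ARGS)
{
    pg_guid *a = (pg_guid *) PG_GETARG_POINTER(0);
    pg_guid *b = (pg_guid *) PG_GETARG_POINTER(1);

    PG_RETURN_BOOL(guid_internal_cmp(a, b) == 0);
}

/* btree support function; <, <=, > and >= operators would be built on it. */
PG_FUNCTION_INFO_V1(guid_cmp);
Datum
guid_cmp(PG_FUNCTION_ARGS)
{
    pg_guid *a = (pg_guid *) PG_GETARG_POINTER(0);
    pg_guid *b = (pg_guid *) PG_GETARG_POINTER(1);

    PG_RETURN_INT32(guid_internal_cmp(a, b));
}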









Re: [HACKERS] Fixed length data types issue

2006-09-07 Thread Andrew - Supernews
On 2006-09-07, Gregory Stark [EMAIL PROTECTED] wrote:
 Consider this real table definition I found in a few moments searching for
 schemas on google:
[snip table with lots of fixed-length char fields]

 By my count postgres would use 154 bytes for this record. Whereas in fact
 there's no need for it to take more than 87 bytes.

Are you sure? Perhaps you are assuming that a char(1) field can be made
to be fixed-length; this is not the case (consider utf-8 for example).

-- 
Andrew, Supernews
http://www.supernews.com - individual and corporate NNTP services



[HACKERS] getting access to gborg, specifically the jdbc CVS files

2006-09-07 Thread Dave Cramer
I realize this isn't the forum for this, but it's the closest thing  
to it.


In order to move to pgfoundry I need the CVS repository. I can get it  
myself if I can login to gborg, or can someone tar it up and put it  
somewhere ?


Dave



Re: [HACKERS] Fixed length data types issue

2006-09-07 Thread Gregory Stark

Andrew - Supernews [EMAIL PROTECTED] writes:

 Are you sure? Perhaps you are assuming that a char(1) field can be made
 to be fixed-length; this is not the case (consider utf-8 for example).

Well, that could still be fixed length; it would just be a longer fixed length.
(In theory it would have to be 6 bytes long, which I suppose would open up the
argument that if you're usually storing 7-bit ascii then a varlena would
usually be shorter.)

In any case I think the intersection of columns for which you care about i18n
and columns that you're storing according to an old-fashioned fixed column
layout is pretty much nil. And not just because it hasn't been updated to
modern standards either. If you look again at the columns in my example you'll
see none of them are appropriate targets for i18n anyways. They're all codes
and even numbers.

In other words, if you're actually storing localized text then you almost
certainly will be using text or varchar and probably won't even have a
maximum size. The use case for CHAR(n) is when you have fixed-length,
statically defined strings that are always the same length. It doesn't make
sense to store these in UTF8.

Currently Postgres has a limitation that you can only have one encoding per
database and one locale per cluster. Personally I'm of the opinion that the
only correct choice for that is C and all localization should be handled in
the client and with pg_strxfrm. Putting the whole database into non-C locales
guarantees that the columns that should not be localized will have broken
semantics and there's no way to work around things in the other direction.

Perhaps given the current situation what we should have are cvarchar and
cchar data types that are like varchar and char but guaranteed to always be
interpreted in the C locale with ASCII encoding.

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com



Re: [HACKERS] Fixed length data types issue

2006-09-07 Thread Martijn van Oosterhout
On Thu, Sep 07, 2006 at 01:11:49PM +0100, Gregory Stark wrote:
 
 Martijn van Oosterhout kleptog@svana.org writes:
 
  The fundamental property of a Datum is that you can pass it by value to
  a C function. This generally means it has to fit in a register. On the
  whole, the CPU register size is the same as the pointer size, so
  2*sizeof(pointer) is unlikely to fit...
 
 Not having it fit in a register might impact performance but it certainly
 isn't a requirement. You can pass whole structs by value in modern C. (And by
 modern here I don't mean C99, this has been supported since before ANSI and is
 required by *C89*).

Sure, the C compiler pushes it on the stack and passes a pointer to the
function. Pass-by-value in this context means pass a reference to a
copy. It works, but it's not very efficient.

The C compiler also allows you to create struct variables and assign them
as if they were plain variables. The assembly code to make this work
isn't pretty. You're proposing doing it everywhere in the backend,
which seems like a huge cost for very little gain.

A better approach would be to revive the proposal for a variable-length
varlena header. It's four bytes fixed because that's easy, but given that most
values are under 4K you could come up with a coding scheme that cuts the
header for such Datums to only 2 bytes, or less...

Have a nice day,
-- 
Martijn van Oosterhout   kleptog@svana.org   http://svana.org/kleptog/
 From each according to his ability. To each according to his ability to 
 litigate.




Re: [HACKERS] Fixed length data types issue

2006-09-07 Thread Gregory Stark

Martijn van Oosterhout kleptog@svana.org writes:

 The fundamental property of a Datum is that you can pass it by value to
 a C function. This generally means it has to fit in a register. On the
 whole, the CPU register size is the same as the pointer size, so
 2*sizeof(pointer) is unlikely to fit...

Not having it fit in a register might impact performance but it certainly
isn't a requirement. You can pass whole structs by value in modern C. (And by
modern here I don't mean C99, this has been supported since before ANSI and is
required by *C89*).
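
For what it's worth, a trivial standalone example of the C feature in
question (nothing PostgreSQL-specific; any C89 compiler accepts this):

/* A two-word struct passed and returned by value. */
#include <stdio.h>

struct wide_datum
{
    void *ptr;          /* the usual pointer */
    long  extra;        /* e.g. a typmod or attlen riding along */
};

static struct wide_datum
make_wide(void *p, long x)
{
    struct wide_datum d;

    d.ptr = p;
    d.extra = x;
    return d;           /* returned by value */
}

static long
get_extra(struct wide_datum d)      /* received by value */
{
    return d.extra;
}

int main(void)
{
    int payload = 7;

    printf("%ld\n", get_extra(make_wide(&payload, 42)));   /* prints 42 */
    return 0;
}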

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com



Re: [HACKERS] Timezone List

2006-09-07 Thread Magnus Hagander
  In the CVS version there is a table with this information:
  http://developer.postgresql.org/pgdocs/postgres/view-pg-timezonenames.html

  Actually, what that view gives you is timezone offset abbreviations,
  not the full zone names that you could use with SET TIME ZONE.  It
  strikes me that we should have a view for that as well.  We could use
  code similar to scan_available_timezones() to generate the view output.

 Any view over the full timezone names should also include the
 corresponding data from zone.tab in the timezone library source.

Just noticed this mail, so that's not included in my patch. But couldn't
we just load that file up in a separate table if needed, and then join
with it when necessary? 

//Magnus




Re: [HACKERS] UUID/GUID discussion leading to request for hexstring bytea?

2006-09-07 Thread Martijn van Oosterhout
On Thu, Sep 07, 2006 at 01:27:21PM +0200, Gevik Babakhani wrote:
 In my opinion only some of the relational/comparison operators, like ==
 and !=, apply to such values; ordering comparisons such as guid > guid or
 md5 > md5 are also meaningless.

<snip>

 4. GUID type must have the ability to be indexed, grouped, ordered,
 DISTINCT... but not MAX(), MIN() or SUM()

Err, for ordered you need to define <, >, <=, >=, which means you're
going to get MAX and MIN for free...

 I also think we should have the GUID/UUID as a datatype and not just
 functions handling hexstring.

Sure, but that will be the I/O format, right?

Have a nice day,

-- 
Martijn van Oosterhout   kleptog@svana.org   http://svana.org/kleptog/
 From each according to his ability. To each according to his ability to 
 litigate.




Re: [HACKERS] Fixed length data types issue

2006-09-07 Thread Martijn van Oosterhout
On Thu, Sep 07, 2006 at 01:27:01PM +0100, Gregory Stark wrote:
 ... If you look again at the columns in my example you'll
 see none of them are appropriate targets for i18n anyways. They're all codes
 and even numbers.

Which begs the question of why they don't store the numbers in numeric
columns? That'll take far less space than any string.

 In other words if you're actually storing localized text then you almost
 certainly will be using a text or varchar and probably won't even have a
 maximum size. The use case for CHAR(n) is when you have fixed length
 statically defined strings that are always the same length. it doesn't make
 sense to store these in UTF8.

It makes sense to store them as numbers, or perhaps an enum.

 Currently Postgres has a limitation that you can only have one encoding per
 database and one locale per cluster. Personally I'm of the opinion that the
 only correct choice for that is C and all localization should be handled in
 the client and with pg_strxfrm. Putting the whole database into non-C locales
 guarantees that the columns that should not be localized will have broken
 semantics and there's no way to work around things in the other direction.

Quite. So if someone would code up SQL COLLATE support and integrate
ICU, everyone would be happy and we could all go home.

BTW, requiring localisation to happen in the client is silly. SQL
provides the ORDER BY clause for strings and it'd be silly to have the
client re-sort them just because they're not using the C locale. The point
of a database was to make your life easier, right?

 Perhaps given the current situation what we should have is a cvarchar and
 cchar data types that are like varchar and char but guaranteed to always be
 interpreted in the c locale with ascii encoding.

I think bytea gives you that, pretty much.

Have a nice day,
-- 
Martijn van Oosterhout   kleptog@svana.org   http://svana.org/kleptog/
 From each according to his ability. To each according to his ability to 
 litigate.




Re: [HACKERS] UUID/GUID discussion leading to request for hexstring bytea?

2006-09-07 Thread Gevik Babakhani
On Thu, 2006-09-07 at 14:46 +0200, Martijn van Oosterhout wrote:
 On Thu, Sep 07, 2006 at 01:27:21PM +0200, Gevik Babakhani wrote:
  In my opinion only some of the relational/comparison operators, like ==
  and !=, apply to such values; ordering comparisons such as guid > guid
  or md5 > md5 are also meaningless.

 <snip>
 
  4. GUID type must have the ability to be indexed, grouped, ordered,
  DISTINCT... but not MAX(), MIN() or SUM()
 
  Err, for ordered you need to define <, >, <=, >=, which means you're
  going to get MAX and MIN for free...
 

Yes indeed, I just want to address that guid > guid or any other
operation like that has no meaning.

  I also think we should have the GUID/UUID as a datatype and not just
  functions handling hexstring.
 
 Sure, but that will be the I/O format, right?

Yes, I/O format as part of the datatype and not a separate function.

 
 Have a nice day,
 




Re: [HACKERS] Fixed length data types issue

2006-09-07 Thread Tom Lane
Martijn van Oosterhout kleptog@svana.org writes:
 On Thu, Sep 07, 2006 at 11:57:26AM +0100, Gregory Stark wrote:
 Just brain storming here. But what happens if we make Datum 2*sizeof(pointer)
 and stored the typmod and/or attlen in it?

 The fundamental property of a Datum is that you can pass it by value to
 a C function. This generally means it has to fit in a register. On the
 whole, the CPU register size is the same as the pointer size, so
 2*sizeof(pointer) is unlikely to fit...

Not to mention the problem that such a change would break every single
datatype-manipulation function in both the core backend and every
user-written datatype.  I don't think we're going there.

regards, tom lane



Re: [HACKERS] getting access to gborg, specifically the jdbc CVS files

2006-09-07 Thread Dave Page


-Original Message-
From: Dave Cramer [EMAIL PROTECTED]
To: PostgreSQL-development pgsql-hackers@postgresql.org
Sent: 07/09/06 12:48
Subject: [HACKERS] getting access to gborg, specifically the jdbc CVS files

 In order to move to pgfoundry I need the CVS repository. I can get it  
 myself if I can login to gborg, or can someone tar it up and put it  
somewhere ?

For other projects I've set up rsync to copy the CVS hourly, then turned it off
when everything is committed and synced and the users are all happy.

I can do the same for jdbc if you like. If so, what are the project names on each
site?

Regards, Dave


Re: [HACKERS] getting access to gborg, specifically the jdbc CVS files

2006-09-07 Thread Dave Cramer

Dave,
On 7-Sep-06, at 10:32 AM, Dave Page wrote:




-Original Message-
From: Dave Cramer [EMAIL PROTECTED]
To: PostgreSQL-development pgsql-hackers@postgresql.org
Sent: 07/09/06 12:48
Subject: [HACKERS] getting access to gborg, specifically the jdbc  
CVS files



In order to move to pgfoundry I need the CVS repository. I can get it
myself if I can login to gborg, or can someone tar it up and put it

somewhere ?

For other projects I've setup rsync to copy the cvs hourly, then  
turned it off  when everything is committed and synced and the  
users are all happy.


I can do so for jdbc if you like? If so, what are the project names  
on each site?


on gborg, the project is pgjdbc
on pgfoundry the project is jdbc



Regards, Dave





Re: [HACKERS] Fixed length data types issue

2006-09-07 Thread Gregory Stark
Martijn van Oosterhout kleptog@svana.org writes:

 On Thu, Sep 07, 2006 at 01:27:01PM +0100, Gregory Stark wrote:
 ... If you look again at the columns in my example you'll
 see none of them are appropriate targets for i18n anyways. They're all codes
 and even numbers.

 Which begs the question of why they don't store the numbers in numeric
 columns? That'll take far less space than any string.

Sure, unless by PRVDR_NUM they mean things like '01' or even 'C1'.

...

 It makes sense to store them as numbers, or perhaps an enum.

Sure, if you're designing the schema from scratch and don't have to
interoperate with any other systems. But if you're receiving a flat ascii text
file and it has a 5-character opaque identifier called FI_NUM, which do you
think is the safer approach for storing these opaque identifiers?

You can suggest that there are other ways of designing a schema that will work
better with Postgres, but I think you're just working around Postgres
deficiencies.

These may be deficiencies that are pretty low down your priority list but they
may be higher up my list. I just don't think you can argue they're not
deficiencies just because you know better than to get tripped up by them. I'm
discussing these things with an eye to getting some kind of consensus on what
should be done about them so I can go do it, not because I'm trying to get you
to work on it :)


 Currently Postgres has a limitation that you can only have one encoding per
 database and one locale per cluster. Personally I'm of the opinion that the
 only correct choice for that is C and all localization should be handled in
 the client and with pg_strxfrm. Putting the whole database into non-C locales
 guarantees that the columns that should not be localized will have broken
 semantics and there's no way to work around things in the other direction.

 Quite. So if someone would code up SQL COLLATE support and integrate
 ICU, everyone would be happy and we could all go home.

Well I for one would be pretty unhappy if ICU were integrated. It seems like a
whole pile of code and complexity for no particular gain. The standard i18n
support with a few extensions (namely strcoll_l) seems to be adequate for us
and doesn't introduce huge new dependencies and code burdens.

 BTW, requireing localisation to happen in the client is silly. SQL
 provides the ORDER BY clause for strings and it'd be silly to have the
 client resort them just because they're not using C locale. The point
 of a database was to make your life easier, right?

That's why I mentioned pg_strxfrm. It doesn't solve all your problems if
you're doing lots of string manipulations in queries but it can handle
collation so you can at least execute ORDER BY clauses which of course you
can't efficiently do in the client. For anything more complex you're probably
happier doing your string manipulations in the client just because SQL's
string primitives are so, well, primitive.

 Perhaps given the current situation what we should have is a cvarchar and
 cchar data types that are like varchar and char but guaranteed to always be
 interpreted in the c locale with ascii encoding.

 I think bytea gives you that, pretty much.

Hm, that's an interesting idea. We could define all the string functions for
bytea as well.

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com



Re: [HACKERS] Fixed length data types issue

2006-09-07 Thread Gregory Stark
Tom Lane [EMAIL PROTECTED] writes:

 Martijn van Oosterhout kleptog@svana.org writes:
 On Thu, Sep 07, 2006 at 11:57:26AM +0100, Gregory Stark wrote:
 Just brain storming here. But what happens if we make Datum 
 2*sizeof(pointer)
 and stored the typmod and/or attlen in it?

 The fundamental property of a Datum is that you can pass it by value to
 a C function. This generally means it has to fit in a register. On the
 whole, the CPU register size is the same as the pointer size, so
 2*sizeof(pointer) is unlikely to fit...

 Not to mention the problem that such a change would break every single
 datatype-manipulation function in both the core backend and every
 user-written datatype.  I don't think we're going there.

Sure, I'm just brain storming. Sometimes thinking about outlandish ideas can
result in quiet reasonable ideas appearing down the line.

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com



Re: [HACKERS] Timezone List

2006-09-07 Thread Tom Lane
Gavin Sherry [EMAIL PROTECTED] writes:
 On Wed, 6 Sep 2006, Tom Lane wrote:
 It's somewhat urgent to address this now, because pg_timezonenames is
 sitting on the obvious name for such a view, and once we release 8.2
 we won't be able to change it.  On reflection I think the existing view
 is wrongly named --- perhaps it should be pg_timezoneabbrevs?  Or
 more readably, perhaps pg_timezone_abbrevs, with pg_timezone_names for
 the other view.

 I think 'abbrev' is a little unintuitive. How about 'short_names'?

I'm not wedded to abbrevs, but I don't like short_names because it
suggests that the names in the one view are just shorter forms of the
names in the other view, whereas really they aren't comparable things
at all (eg, EDT and EST5EDT are very different animals, because the
latter includes a set of DST transition-date rules).

I suppose the same argument could be made against abbrevs of course,
but it seems stronger if we have names and short_names.

regards, tom lane



Re: [HACKERS] Problems with extended-Query logging code

2006-09-07 Thread Tom Lane
Bruce Momjian [EMAIL PROTECTED] writes:
 Tom Lane wrote:
 I happened to notice that the recently added code to log Bind-message
 parameters was printing garbage into my log.  On investigation it turned
 out to be trying to print an already-pfree'd string.  That's fixable,

 Uh, can you show me where?

I didn't trace it down yet, but what I saw was garbage printed at
Execute time in an assert-enabled build.  I think the bind-parameter
string that's being saved in the portal for Execute to print is not
copied into the portal's context.  Since I want to get rid of that bit
anyway it didn't seem critical to identify the exact bug.

regards, tom lane



Re: [HACKERS] Problems with extended-Query logging code

2006-09-07 Thread Bruce Momjian
Tom Lane wrote:
 Bruce Momjian [EMAIL PROTECTED] writes:
  Tom Lane wrote:
  I happened to notice that the recently added code to log Bind-message
  parameters was printing garbage into my log.  On investigation it turned
  out to be trying to print an already-pfree'd string.  That's fixable,
 
  Uh, can you show me where?
 
 I didn't trace it down yet, but what I saw was garbage printed at
 Execute time in an assert-enabled build.  I think the bind-parameter
 string that's being saved in the portal for Execute to print is not
 copied into the portal's context.  Since I want to get rid of that bit
 anyway it didn't seem critical to identify the exact bug.

OK, thanks.  I thought I was doing that with a pstrdup(), but it might
have been in the wrong context at the time.  Thanks.

-- 
  Bruce Momjian   [EMAIL PROTECTED]
  EnterpriseDBhttp://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +



Re: [HACKERS] Fixed length data types issue

2006-09-07 Thread Martijn van Oosterhout
On Thu, Sep 07, 2006 at 03:38:10PM +0100, Gregory Stark wrote:
 Well I for one would be pretty unhappy if ICU were integrated. It seems like a
 whole pile of code and complexity for no particular gain. The standard i18n
 support with a few extensions (namely strcoll_l) seems to be adequate for us
 and not introduce huge new dependencies and code burdens.

Let's be serious here. The patch is 18k (729 lines), hardly a whole pile
of code and complexity. The patch has been in the FreeBSD ports
collection for a rather long time, so it's not like it's untested.

http://www.freebsd.org/cgi/cvsweb.cgi/ports/databases/postgresql81-server/Makefile?rev=1.156&content-type=text/x-cvsweb-markup
and search for ICU. The actual patches are here:
http://people.freebsd.org/~girgen/postgresql-icu/

The point is that strcoll_l doesn't exist on most platforms, so unless
someone is going to write another locale library, why not just use one
that's available?

 That's why I mentioned pg_strxfrm. It doesn't solve all your problems if
 you're doing lots of string manipulations in queries but it can handle
 collation so you can at least execute ORDER BY clauses which of course you
 can't efficiently do in the client. For anything more complex you're probably
 happier doing your string manipulations in the client just because SQL's
 string primitives are so, well, primitive.

I think you're making the assumption that client locale support is
going to be better than the server's.

Besides, pg_strxfrm doesn't help you if you want to do
accent-insensitive matching. Sometimes you don't just want to change
the order, you also want to change what is equal.

Have a nice day,
-- 
Martijn van Oosterhout   kleptog@svana.org   http://svana.org/kleptog/
 From each according to his ability. To each according to his ability to 
 litigate.




[HACKERS] large object regression tests

2006-09-07 Thread Jeremy Drake
I noticed when I was working on a patch quite a while back that there are
no regression tests for large object support.  I know, large objects
are not the most sexy part of the code-base, and I think they tend to be
ignored/forgotten most of the time.  Which IMHO is all the more reason
they should have some regression tests.  Otherwise, if someone managed to
break them somehow, it is quite likely not to be noticed for quite some
time.

So in this vein, I have recently found myself with some free time, and a
desire to contribute something, and decided this would be the perfect
place to get my feet wet without stepping on any toes.

I guess what I should ask is, would a patch to add a test for large
objects to the regression suite be well received?  And, is there any
advice for how to go about making these tests?

I am considering, and I think that in order to get a real test of the
large objects, I would need to load data into a large object which would
be sufficient to be loaded into more than one block (large object blocks
were 1 or 2K IIRC) so that the block boundary case could be tested.  Is
there any precedent on where to grab such a large chunk of data from?  I
was thinking about using an excerpt from a public domain text such as Moby
Dick, but on second thought binary data may be better to test things with.

My current efforts, and probably the preliminary portion of the final
test, involves loading a small amount (less than one block) of text into a
large object inline from a sql script and calling the various functions
against it to verify that they do what they should.  In the course of
doing so, I find that it is necessary to stash certain values across
statements (large object ids, large object 'handles'), and so far I am
using a temporary table to store these.  Is this reasonable, or is there a
cleaner way to do that?
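
For the record, the kind of round trip such a test has to exercise looks
roughly like this with libpq's client-side large object API (a sketch only:
the real regression test would more likely drive the server-side large
object functions from a .sql script, and the connection string and buffer
size here are arbitrary):

/* Sketch: write a large object bigger than one LO block, read it back,
 * and compare.  Error handling omitted for brevity. */
#include <stdio.h>
#include <string.h>
#include <libpq-fe.h>
#include <libpq/libpq-fs.h>         /* INV_READ / INV_WRITE */

int main(void)
{
    PGconn *conn = PQconnectdb("dbname=regression");
    char    out[4096];              /* larger than one LO block */
    char    in[4096];
    Oid     loid;
    int     fd;

    if (PQstatus(conn) != CONNECTION_OK)
        return 1;
    memset(out, 'x', sizeof(out));

    PQclear(PQexec(conn, "BEGIN"));     /* LO calls must run in a transaction */
    loid = lo_creat(conn, INV_READ | INV_WRITE);

    fd = lo_open(conn, loid, INV_WRITE);
    lo_write(conn, fd, out, sizeof(out));
    lo_close(conn, fd);

    fd = lo_open(conn, loid, INV_READ);
    lo_read(conn, fd, in, sizeof(in));
    lo_close(conn, fd);
    PQclear(PQexec(conn, "COMMIT"));

    printf("round trip %s\n",
           memcmp(out, in, sizeof(out)) == 0 ? "ok" : "FAILED");
    PQfinish(conn);
    return 0;
}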

-- 
Even if you're on the right track, you'll get run over if you just sit there.
-- Will Rogers



Re: [HACKERS] [PATCHES] Backend SSL configuration enhancement

2006-09-07 Thread Victor Wagner
On 2006.09.04 at 15:46:03 -0400, Bruce Momjian wrote:

 Tom Lane wrote:
  Bruce Momjian [EMAIL PROTECTED] writes:
   This has been saved for the 8.3 release:
 http://momjian.postgresql.org/cgi-bin/pgpatches_hold
  
  This version was withdrawn by the author for rework, no?
 
 Right, and the thread in patches_hold shows that.  The reason it is in
 there is so we can ping the author at the start of 8.3 to get an updated
 version.

I've already sent version 2 of the patch to the patches mailing list.
I think that letter even got into the thread mentioned above.

It's a pity that it's too late for the patch to get into 8.2.
It means that for the whole 8.2 lifecycle we'll have to maintain this patch
separately.




Re: [HACKERS] Getting a move on for 8.2 beta

2006-09-07 Thread Carlo Florendo

Alvaro Herrera wrote:

Joshua D. Drake wrote:

  
It does not mean all those features are not useful, they definitely are. I
am just trying to look at it from a:

WHIZ* BANG* POW* perspective.



Holy crap, Batman!  This database can do

INSERT INTO foo VALUES (1,1, 'so long'), (42, 2, 'and thanks'),
(142857, 3, 'for all the fish')
  


I just lurk here at pgsql-hackers.

That function would be very cool.

Thanks!

Best Regards,

Carlo Florendo
Astra Philippines Inc.
www.astra.ph




Re: [HACKERS] Ding-dong, contrib is dead ...

2006-09-07 Thread Abhijit Menon-Sen
At 2006-09-05 05:47:58 -, [EMAIL PROTECTED] wrote:

 that difficulty no longer exists now that Abhijit has posted his
 clean-room rewrite (look for otherlock in -patches). Perhaps he
 would be prepared to turn that into a patch against the core...

Absolutely. Just tell me where it should live and I'll post a patch.

-- ams



[HACKERS] Looking at Postgres 8.2

2006-09-07 Thread Strong, David
As part of our ongoing research into Postgres performance and
scalability, we recently downloaded version 8.2 from CVS and we wanted
to pass on some observations.

When comparing 8.2 against 8.1.4, we see that there is roughly a 20%
increase in throughput. We credit most of this improvement to the
modifications made to the way in which the BufMappingLock and
LockMgrLock locks are now handled. Locking sections (partitions) of the
Shared Buffer and Locking hash tables certainly seems to pay off. 

We had also come to the same conclusion and added similar code into a
local copy of 8.1.4. Although, we used SpinLocks rather than LWLocks to
lock sections of the hash table and we used an LWLock to lock the hash
table for critical operations. 

Against 8.1.4, we saw that LWLockAcquire was taking the majority of the
run time, so we added some monitoring code to track the amount of time
spent to acquire and release each lock. The BufMappingLock and
LockMgrLock were, of course, on the top of that list.

We moved some of our 8.1.4 lock monitoring code over to 8.2 to analyze
the lock distribution for the partitions. In doing so, we noted that
WALInsertLock had now become a bottleneck, absorbing most of the time
freed up by the BufMappingLock and LockMgrLock changes.

We took the bold step of making XLogInsert a NOP. The next lock to rise to
the top of the list was SInvalLock. However, the time increase in
SInvalLock did not seem as large as that in WALInsertLock. Although we have
not taken steps to do so, we assume that removing SInvalLock would reveal
the next lock that might impede scalability.

Outside of LWLockAcquire (~10%), the next functions that really seem to
take most of the time are HeapTupleSatisfiesSnapshot (~21%) and
PinBuffer (~14%). We're currently profiling and monitoring those
functions.



Re: [HACKERS] New Linux Filesystem: NILFS

2006-09-07 Thread Jeff Davis
On Wed, 2006-09-06 at 22:12 -0400, Christopher Browne wrote:

  Can you elaborate a little? Which filesystems have been problematic?
  Which filesystems are you more confident in?
 
 Well, more or less *all* of them, on AMD-64/Linux.
 
 The pulling the fibrechannel cable test blew them all.  XFS, ext3,
 JFS.  ReiserFS was, if I recall correctly, marginally better, but only
 marginally.
 
 On AIX, we have seen JFS2 falling over when there were enough levels
 of buffering in the way on disk arrays.
 

Well, that's interesting. I suppose I can't count on the filesystem as
much as I thought. Are you implying that the filesystems aren't ready on
64-bit? Is it more of a hardware issue (a controller lying about the
security of the write)? Any comments on FreeBSD/UFS+SU? I would expect
UFS+SU to have similar issues, since it depends on write ordering also.

What do you do for better data security (aside from the obvious don't
pull cables)?

Regards,
Jeff Davis








Re: [HACKERS] getting access to gborg, specifically the jdbc CVS

2006-09-07 Thread Dave Page



On 7/9/06 15:34, Dave Cramer [EMAIL PROTECTED] wrote:

 I can do so for jdbc if you like? If so, what are the project names
 on each site?
 
 on gborg, the project is pgjdbc
 on pgfoundry the project is jdbc

Hi Dave,

I've setup a cron job to sync pgFoundry to Gborg every hour at half past,
and run an initial sync. When your developers have committed the last of
their changes to Gborg and are happy that everything looks good on pgFoundry
I'll remove the cron job and you can continue to work on the pgFoundry site.

Please let me know when you want me to complete the migration.

Regards, Dave.




Re: [HACKERS] Fixed length data types issue

2006-09-07 Thread Bruce Momjian
Peter Eisentraut wrote:
 Gregory Stark wrote:
  By my count postgres would use 154 bytes for this record. Whereas in
  fact there's no need for it to take more than 87 bytes. Almost 100%
  overhead for varattlen headers and the padding they necessitate.
 
 The thing is, 100% extra space is cheap, but the processing power for 
 making the need for that extra space go away is not.

I think it would be good to see if we can extend the varlena data types
to support a shorter header for storing short byte values.  Looking at
the header now we have:

#define VARATT_FLAG_EXTERNAL    0x80000000
#define VARATT_FLAG_COMPRESSED  0x40000000
#define VARATT_MASK_FLAGS       0xc0000000
#define VARATT_MASK_SIZE        0x3fffffff

#define VARATT_SIZEP(_PTR)  (((varattrib *)(_PTR))->va_header)

so there is precedent for overloading that header, but currently all the
headers are four bytes.  The big question is can a bit be allocated to
indicate a short byte header is being used?  Can we do this with minimal
performance impact for non-short values?

One test would be to adjust the masks above to assign one bit to mean
"I am a short value header", and I think that leaves you with 5 bits ==
32, which is probably enough for a test.
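
A toy illustration of that arithmetic (these are not the real varattrib
macros, just one invented flag bit plus 5 length bits packed into a
single-byte header):

/* Toy only: a hypothetical "short header" flag bit and 5 length bits,
 * so payloads of up to 31 bytes need a 1-byte header instead of 4. */
#include <stdio.h>

#define SHORTHDR_FLAG   0x20    /* invented flag bit, not a real varlena bit */
#define SHORTHDR_MAXLEN 0x1f    /* 5 remaining bits -> lengths 0..31 */

static unsigned char
shorthdr_encode(unsigned int payload_len)
{
    /* caller must have checked payload_len <= SHORTHDR_MAXLEN */
    return (unsigned char) (SHORTHDR_FLAG | payload_len);
}

static int
shorthdr_is_short(unsigned char first_byte)
{
    return (first_byte & SHORTHDR_FLAG) != 0;
}

static unsigned int
shorthdr_length(unsigned char first_byte)
{
    return first_byte & SHORTHDR_MAXLEN;
}

int main(void)
{
    unsigned char h = shorthdr_encode(17);

    printf("short=%d len=%u\n", shorthdr_is_short(h), shorthdr_length(h));
    return 0;
}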

-- 
  Bruce Momjian   [EMAIL PROTECTED]
  EnterpriseDBhttp://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +



Re: [HACKERS] Getting a move on for 8.2 beta

2006-09-07 Thread Kevin Brown
Tom Lane wrote:
 Bruce Momjian [EMAIL PROTECTED] writes:
  Tom Lane wrote:
  Well, it's taken us the full month to get this far through the queue, so
  I'd sure like to have more people on board reviewing next time.  Alvaro
  helped but I miss Neil Conway, and some other people have helped in the
  past.  However --- the present situation has nothing to do with lack of
  reviewers, it's lack of time to finish the patches.
 
  I did try to get us additional help in reviewing.  Neil was unavailable,
  and Alvaro could only give part of his time.
 
 It strikes me that setting feature freeze in midsummer might not be the
 best strategy for having manpower available to review --- people tend to
 be on vacation in August.  Maybe the answer is just to move the dates a
 bit one way or the other.

Hmm...but if you're going to do that, why not do that now: push the
beta date back by, say, a month (or however long you had in mind) for
this cycle.  That way, the two major patches that are likely to be
dropped for this cycle stand a chance to make it into this release,
and you accomplish your goal of moving the dates a bit all at the same
time.



-- 
Kevin Brown   [EMAIL PROTECTED]



Re: [HACKERS] Postgres tracking - the pgtrack project

2006-09-07 Thread Jim C. Nasby
On Sat, Sep 02, 2006 at 06:31:51PM +0100, Dave Page wrote:
  BTW, another output thing you might consider is having draft release
  notes ready-to-go on demand.  Currently, Bruce prepares the release
  notes on the basis of a very tedious scan of the CVS commit logs.
  If this sort of stuff were being dropped into a tracker as it went
  into the CVS tree, at least the research part of making the notes would
  be zero-effort (or perhaps better to say that the work would be spread
  out instead of concentrated).
 
 We have the developers update the CHANGELOG file with each non-trivial
 change in pgAdmin. It works very well for us and producing release
 announcements etc. is a trivial task. The overhead on each developer is
 virtually zero as well as they have to compose some text for the commit
 message anyway.

Another possibility would be to annotate commit messages that should get
added to the release notes and have something automagically cull those
out of CVS.
-- 
Jim C. Nasby, Database Architect   [EMAIL PROTECTED]
512.569.9461 (cell) http://jim.nasby.net



Re: [HACKERS] Postgres tracking - the pgtrack project

2006-09-07 Thread Jim C. Nasby
On Sun, Sep 03, 2006 at 09:18:00AM +0200, Peter Eisentraut wrote:
 Joshua D. Drake wrote:
  http://pgbugs.commandprompt.com (still need to configure email).
 
 Thank you for that.
 
 I think an issue tracking system for patches and such may need to be 
 distinct from a bug-tracking system such as bugzilla, but let's get one 
 thing after another up.

Actually, I've generally found bugzilla to be a decent tool for
general-purpose tracking. Feature requests and what-not need a lot of
the same capabilities of a bug tracker, and having everything in one
tool is often very handy.
-- 
Jim C. Nasby, Database Architect   [EMAIL PROTECTED]
512.569.9461 (cell) http://jim.nasby.net



Re: [HACKERS] gBorg status?

2006-09-07 Thread Marc G. Fournier


everything should be back up and running ... vServer is now running on our 
newest, 64bit HP server with FreeBSD 6.x ...


On Wed, 6 Sep 2006, Marc G. Fournier wrote:


On Mon, 4 Sep 2006, Dave Page wrote:

My understanding is that Gborg is being recovered from backup as I type. I 
also understand that the delay was not caused by lack of backups or 
anything similarly scary, but simply by other priorities.


Yes, I have the backup uploaded right now, and doing 'recover procedures' as 
I type this ... it will be up before I go to bed tonight ...


As for the 'other priorities' ... the delay was due to an insufficient
Internet connection to upload the backup ... we were waiting for our Internet
to be installed in our new location, and were using an underpowered temporary
connection in the interim ... that one was going to tie up the connection for
40 hours ... once we were able to get the backup server onto its 'permanent
connection', the upload took ~5hrs ...

As I mentioned, am working on it right now ... will post a follow up once 
she's back up live and needing to be tested ... she's also moving to one of 
our 64bit servers, so should be a wee bit better performance wise ...



Marc G. Fournier   Hub.Org Networking Services (http://www.hub.org)
Email . [EMAIL PROTECTED]  MSN . [EMAIL PROTECTED]
Yahoo . yscrappy   Skype: hub.orgICQ . 7615664





Marc G. Fournier   Hub.Org Networking Services (http://www.hub.org)
Email . [EMAIL PROTECTED]  MSN . [EMAIL PROTECTED]
Yahoo . yscrappy   Skype: hub.orgICQ . 7615664



Re: [HACKERS] Fixed length data types issue

2006-09-07 Thread Gregory Stark
Martijn van Oosterhout kleptog@svana.org writes:

 On Thu, Sep 07, 2006 at 03:38:10PM +0100, Gregory Stark wrote:
  Well I for one would be pretty unhappy if ICU were integrated. It seems 
  like a
  whole pile of code and complexity for no particular gain. The standard i18n
  support with a few extensions (namely strcoll_l) seems to be adequate for us
  and not introduce huge new dependencies and code burdens.
 
 Let's be serious here. The patch is 18k (729 lines), hardly whole pile
 of code and complexity. The patch has been in the FreeBSD ports
 collection for a rather long time, so it's not like it's not tested.

Uhm, an ICU source tree is over 40 *megabytes*. That's almost as much as the
rest of Postgres itself and that doesn't even include documentation. Even if
you exclude the data and regression tests you're still talking about depending
on the portability and correctness of over 10 megabytes of new code.

 The point is that strcoll_l doesn't exist on most platforms, so unless
 someone is going to write another locale library, why not just use one
 that's available?

Neither is ICU available on most platforms. In any case we only need strcoll_l
as a performance optimization; the regular interface works, it's just slow.

 I think you're making the assumption that client locale support is
 going to be better than the server's.

Well we know it is because Postgres's support is basically nonexistent.

 Besides, pg_strxfrm doesn't help you if you want to do
 accent-insensetive matching. Sometimes you don't just want to change
 the order, you also want to change what is equal.

Well, "equal" is part of collation, at least in the sense you mean. What it
doesn't help with is things like tolower or regexp matching. These are the
things that I would suggest you usually want to be doing on the client, because
SQL's string manipulation facilities are so poor compared to most client
languages.

-- 
greg




Re: [HACKERS] Fixed length data types issue

2006-09-07 Thread Gregory Stark

Peter Eisentraut [EMAIL PROTECTED] writes:

 Gregory Stark wrote:
  By my count postgres would use 154 bytes for this record. Whereas in
  fact there's no need for it to take more than 87 bytes. Almost 100%
  overhead for varattlen headers and the padding they necessitate.
 
 The thing is, 100% extra space is cheap, but the processing power for 
 making the need for that extra space go away is not.

That's simply untrue for most applications. Unless you can fit much of your
database into RAM, that 100% extra space translates directly into being 100% slower.

This is most obviously the case for data warehouses that are doing lots of
sequential scans of tables that don't fit in cache. 

But it's largely true for OLTP applications too. The more compact the data the
more tuples fit on a page and the greater the chance you have the page you
need in cache.

-- 
greg




Re: [HACKERS] Fixed length data types issue

2006-09-07 Thread Gregory Stark
Bruce Momjian [EMAIL PROTECTED] writes:

 I think it would be good to see if we can extend the varlena data types
 to support a shorter header for storing short byte values.  Looking at
 the header now we have:

This isn't the first time we've been down that route. There were some
extensive discussions a while back. I think there were even patches.
I don't remember why it was eventually rejected. I suspect it simply got too
complex.

But I think this is a dead-end route. What you're looking at is the number 1
repeated for *every* record in the table. And what you're proposing amounts to
noticing that the number 4 fits in a byte and doesn't need a whole word to
store it. Well sure, but you don't even need a byte if it's going to be the
same for every record in the table.

If someone popped up on the list asking about whether Postgres compressed
their data efficiently if they stored a column that was identical throughout
the whole table you would tell them to normalize their data.

-- 
greg




[HACKERS] log_duration is redundant, no?

2006-09-07 Thread Tom Lane
AFAICS, there is absolutely no difference anymore between turning
log_duration ON and setting log_min_duration_statement to zero.
ISTM that having the two redundant GUC settings is just confusing,
and we should remove log_duration to simplify things.

regards, tom lane



Re: [HACKERS] Fixed length data types issue

2006-09-07 Thread Peter Eisentraut
Gregory Stark wrote:
 By my count postgres would use 154 bytes for this record. Whereas in
 fact there's no need for it to take more than 87 bytes. Almost 100%
 overhead for varattlen headers and the padding they necessitate.

The thing is, 100% extra space is cheap, but the processing power for 
making the need for that extra space go away is not.

-- 
Peter Eisentraut
http://developer.postgresql.org/~petere/

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] log_duration is redundant, no?

2006-09-07 Thread Guillaume Smet

Tom,

On 9/7/06, Tom Lane [EMAIL PROTECTED] wrote:

AFAICS, there is absolutely no difference anymore between turning
log_duration ON and setting log_min_duration_statement to zero.
ISTM that having the two redundant GUC settings is just confusing,
and we should remove log_duration to simplify things.


log_duration can be useful if we restore behaviour similar to what it did
in 7.4, i.e. logging only the duration for every query. This way you can
have a global overview of your database activity.
I mean:
log_duration = on
log_min_duration_statement = 500
would log only the duration for queries faster than 500 ms, and duration +
query text for queries slower than 500 ms (we can easily avoid the
redundancy).

This allows you to gather overall query statistics without too much I/O. It
doesn't give you all the information, as you don't have the query text, but
it can give you useful statistics (to explain why the database server is
suddenly slower: we executed 2 million queries last month, now we execute
8 million. We cannot easily get this sort of information today, as
log_min_duration_statement = 500 won't give it and
log_min_duration_statement = 0 generates far too much I/O).

For another real-life example, we are switching from manual vacuum to
autovacuum for one of our customers and we want to know the consequences.
The query text is useless for a first analysis (and logging it would slow
down the database too much); we just want to detect whether things get
slower or not.

Could we consider reintroducing the old behaviour rather than removing
this setting (we can rename it to a better name if needed)? I already
have a patch for that, as we run a patched version of 8.1.4 here to get
this very behaviour.

--
Guillaume

---(end of broadcast)---
TIP 4: Have you searched our list archives?

  http://archives.postgresql.org


Re: [HACKERS] log_duration is redundant, no?

2006-09-07 Thread Tom Lane
Guillaume Smet [EMAIL PROTECTED] writes:
 I mean:
 log_duration = on
 log_min_duration_statement = 500
 would log only duration for queries faster than 500 ms and duration +
 query text for queries slower than 500ms (we can easily avoid
 redundancy).

I don't find this very persuasive --- it sounds awfully messy, and in
fact isn't that exactly the old behavior we got rid of because no one
could understand it?

regards, tom lane

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] log_duration is redundant, no?

2006-09-07 Thread David Fetter
On Thu, Sep 07, 2006 at 06:06:51PM -0400, Tom Lane wrote:
 Guillaume Smet [EMAIL PROTECTED] writes:
  I mean:
  log_duration = on
  log_min_duration_statement = 500
  would log only duration for queries faster than 500 ms and
  duration + query text for queries slower than 500ms (we can easily
  avoid redundancy).
 
 I don't find this very persuasive --- it sounds awfully messy, and
 in fact isn't that exactly the old behavior we got rid of because no
 one could understand it?

Guillaume's the author of pgfouine, which understands it and helps
others to do the same.

Cheers,
D
-- 
David Fetter [EMAIL PROTECTED] http://fetter.org/
phone: +1 415 235 3778AIM: dfetter666
  Skype: davidfetter

Remember to vote!

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] log_duration is redundant, no?

2006-09-07 Thread Josh Berkus
Tom,

 I don't find this very persuasive --- it sounds awfully messy, and in
 fact isn't that exactly the old behavior we got rid of because no one
 could understand it?

Well, we want analogous functionality.   We could stand to have it 
named/organized differently.   But maybe we should hold those changes for 
8.3, so that they can be tested properly?

I am finding that the log format prior to Bruce's change, which we were 
using for TPCE, makes it very hard to do log digest analysis if you use 
SPs or prepared queries at all.

-- 
--Josh

Josh Berkus
PostgreSQL @ Sun
San Francisco

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] ECPG/OpenBSD buildfarm failures, take I

2006-09-07 Thread Stefan Kaltenbrunner
Tom Lane wrote:
 Michael Fuhr [EMAIL PROTECTED] writes:
 In Plauger's _The Standard C Library_ (1992) on p 335 is an excerpt
 from the standard (I think).  At the end of a section entitled
 "7.10.1.4 The strtod function" is the following: "If the correct
 value would cause underflow, zero is returned and the value of the
 macro ERANGE is stored in errno."
 
 The Single Unix Spec also makes it clear that ERANGE on underflow is not
 optional:
 http://www.opengroup.org/onlinepubs/007908799/xsh/strtod.html
 
 I think there is no question that OpenBSD is broken.  The question for
 us is whether we should expend effort to work around that.  We already
 have a small-is-zero workaround comparison file in the main regression
 tests, so my thought is that ecpg should probably do likewise ...

The openbsd guys are already aware of the issue and working on a solution:

http://marc.theaimsgroup.com/?l=openbsd-tech&m=115756205505000&w=2


Stefan

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [HACKERS] log_duration is redundant, no?

2006-09-07 Thread Guillaume Smet

On 9/8/06, Tom Lane [EMAIL PROTECTED] wrote:

I don't find this very persuasive --- it sounds awfully messy, and in
fact isn't that exactly the old behavior we got rid of because no one
could understand it?


I gave real use cases and we use it every day. It really helps us as a
PostgreSQL hosting company.

The fact is that no tool could really exploit this behaviour before. I
agree it's totally useless information if you don't have a tool to
analyze the logs. This is no longer the case, as pgFouine can extract
this information and make it useful by aggregating it.

Perhaps we could rename it to log_all_duration (my English is not that
good, so I'm not sure it's a good name) and explain in the documentation
how it can be useful.

--
Guillaume

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] Fixed length data types issue

2006-09-07 Thread Bruce Momjian
Gregory Stark wrote:
 Bruce Momjian [EMAIL PROTECTED] writes:
 
  I think it would be good to see if we can extend the varlena data types
  to support a shorter header for storing short byte values.  Looking at
  the header now we have:
 
 This isn't the first time we've been down that route. There were some
 extensive discussions a while back. I think there were even patches.
 I don't remember why it was eventually rejected. I suspect it simply got too
 complex.
 
 But I think this is a dead-end route. What you're looking at is the number 1
 repeated for *every* record in the table. And what you're proposing amounts to
 noticing that the number 4 fits in a byte and doesn't need a whole word to
 store it. Well sure, but you don't even need a byte if it's going to be the
 same for every record in the table.
 
 If someone popped up on the list asking about whether Postgres compressed
 their data efficiently if they stored a column that was identical throughout
 the whole table you would tell them to normalize their data.

I am confused.  You don't want to shrink the header but instead compress
duplicate values in the same row to a single entry?

-- 
  Bruce Momjian   [EMAIL PROTECTED]
  EnterpriseDBhttp://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [HACKERS] log_duration is redundant, no?

2006-09-07 Thread Bruce Momjian
Guillaume Smet wrote:
 On 9/8/06, Tom Lane [EMAIL PROTECTED] wrote:
  I don't find this very persuasive --- it sounds awfully messy, and in
  fact isn't that exactly the old behavior we got rid of because no one
  could understand it?
 
 I gave real use cases and we use it every day. It really helps us as a
 PostgreSQL hosting company.
 
 The fact is that no tool could really exploit this behaviour before. I
 agree it's a totally useless information if you don't have a tool to
 analyze the logs. This is no longer the case as pgFouine can extract
 this information and make it useful by aggregating it.
 
 Perhaps we could rename it to log_all_duration (my english is not that
 good so I'm not sure it's a good name) and explain how it can be
 useful in the documentation.

If you are using an external tool, can't you just restrict what you
display based on the logged duration?

-- 
  Bruce Momjian   [EMAIL PROTECTED]
  EnterpriseDBhttp://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [HACKERS] log_duration is redundant, no?

2006-09-07 Thread Tom Lane
Bruce Momjian [EMAIL PROTECTED] writes:
 If you are using an external tool, can't you just restrict what you
 display based on the logged duration?

I think his basic complaint is that doing the full logging pushup for
even short-duration queries is too expensive, and that logging only the
duration and not the query text or parameters makes a significant speed
difference.  I'm not at all sure that I buy that, but if it's true then
subsequent filtering obviously doesn't help.

regards, tom lane

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [HACKERS] log_duration is redundant, no?

2006-09-07 Thread Guillaume Smet

On 9/8/06, Bruce Momjian [EMAIL PROTECTED] wrote:

If you are using an external tool, can't you just restrict what you
display based on the logged duration?


It's not a matter of having too much information in our reports (the
more information I have, the happier I am :)). It's a matter of
slowing the server down too much with too much I/O.
We can afford to log every duration plus the queries slower than 500 ms
with nearly no overhead. We can't afford to log every query; it
generates too much I/O - note that we tried it and it was really
too slow.
With the former configuration we log 1.2 GB/day; with the latter I
suspect it would be far more than 60 GB/day (I don't have the exact
number, as we can't do it for real, but queries slower than 500 ms
represent about 1/100 of the total amount of queries).

Query logging is really a nice way to monitor the activity of a
PostgreSQL server, and the overhead is not that high as long as the
logging I/O is not too intensive.

--
Guillaume

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [HACKERS] log_duration is redundant, no?

2006-09-07 Thread Bruce Momjian
Tom Lane wrote:
 Bruce Momjian [EMAIL PROTECTED] writes:
  If you are using an external tool, can't you just restrict what you
  display based on the logged duration?
 
 I think his basic complaint is that doing the full logging pushup for
 even short-duration queries is too expensive, and that logging only the
 duration and not the query text or parameters makes a significant speed
 difference.  I'm not at all sure that I buy that, but if it's true then
 subsequent filtering obviously doesn't help.

Well, except for bind, all the log output display is zero cost, just a
printf(), as I remember.  The only cost that is significant, I think, is
the timing of the query, and that is happening for all the settings
discussed.

-- 
  Bruce Momjian   [EMAIL PROTECTED]
  EnterpriseDBhttp://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [HACKERS] New Linux Filesystem: NILFS

2006-09-07 Thread Chris Browne
[EMAIL PROTECTED] (Jeff Davis) writes:
 On Wed, 2006-09-06 at 22:12 -0400, Christopher Browne wrote:

  Can you elaborate a little? Which filesystems have been problematic?
  Which filesystems are you more confident in?
 
 Well, more or less *all* of them, on AMD-64/Linux.
 
 The "pull the fibrechannel cable" test blew them all.  XFS, ext3,
 JFS.  ReiserFS was, if I recall correctly, marginally better, but only
 marginally.
 
 On AIX, we have seen JFS2 falling over when there were enough levels
 of buffering in the way on disk arrays.

 Well, that's interesting. I suppose I can't count on the filesystem
 as much as I thought. Are you implying that the filesystems aren't
 ready on 64-bit?

I don't think this necessarily is a 64 bit issue; it's more that with
the more esoteric, expensive disk array hardware, there are fewer with
the ability to test it, because you need $200K worth of hardware
around to do the testing.

 Is it more of a hardware issue (a controller lying about the
 security of the write)? Any comments on FreeBSD/UFS+SU? I would
 expect UFS+SU to have similar issues, since it depends on write
 ordering also.

 What do you do for better data security (aside from the obvious
 don't pull cables)?

The last time we looked, FreeBSD wasn't an option at all, because
there wasn't any suitable FibreChannel support.  That may have
changed; haven't researched lately.

The issue that the NILFS people pointed out seems a troublesome one,
namely that the more levels of caching there are (even if battery-backed),
the less certain you can be that the hardware isn't lying about write
ordering.

I haven't got an answer...
-- 
let name=cbbrowne and tld=cbbrowne.com in String.concat @ [name;tld];;
http://linuxdatabases.info/info/multiplexor.html
Jury  -- Twelve  people  who  determine which  client  has the  better
lawyer.

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [HACKERS] log_duration is redundant, no?

2006-09-07 Thread Tom Lane
Bruce Momjian [EMAIL PROTECTED] writes:
 Well, except for bind, all the log output display is zero cost, just a
 printf(), as I remember.  The only cost that is significant, I think, is
 the timing of the query, and that is happening for all the setttings
 discussed.

On a machine with slow gettimeofday(), measuring duration at all is
going to hurt, but apparently that is not Guillaume's situation ---
what's costing him is sheer log volume.  And remember that the
slow-gettimeofday problem exists mainly on cheap PCs, not server-grade
hardware.  Based on his experience I'm prepared to believe that there
is a use-case for logging just the duration for short queries.

It seems like we should either remove the separate log_duration boolean
or make it work as he suggests.  I'm leaning to the second answer now.
What's your vote?

regards, tom lane

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


[HACKERS] Domains and subtypes, a brief proposal

2006-09-07 Thread elein

As many of you know I've been contemplating the implementation
of Domains and subtypes.

DISCLAIMER:  This is a proposal only.  The actual work needs to
be picked up by someone in a better place to work on the code
than I am.  For various reasons, I can only be an active reference 
and tester on this project.

DISCLAIMER:  This is a proposal, not a detailed spec.  A more detailed
spec can be written with the help of a developer.


Domains and subtypes.

   * Create new child type from values in parent type.
   * Maintain only checks for constraints
   * Create implicit casts from child to parent

With this model, all of the existing type checking should continue to
work, and the special-case domain checking can be removed everywhere
except for constraint checking.

   * Add proper SQL standard CREATE childtype UNDER parenttype 
 This should include all of the ordinary create type options.
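
For illustration, a rough sketch of how this might look next to today's
CREATE DOMAIN (the UNDER form below is the proposed syntax and does not
exist yet; the domain form works today):

   -- today: a constrained "subtype" of text, spelled as a domain
   CREATE DOMAIN us_postal_code AS text
       CHECK (VALUE ~ '^[0-9]{5}$');

   -- proposed: a proper child type created UNDER its parent, carrying the
   -- same constraint plus an implicit cast back to text (hypothetical syntax)
   CREATE TYPE us_postal_code UNDER text
       CHECK (VALUE ~ '^[0-9]{5}$');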

Constraints on types:

   * Change pg_type to hold a nullable constraint text column
 OR add a type constraint lookup table (pg_domains?)

This is a big deal, I know.  Constraints for domains would be in
that field.  All domain checking should be done from that source.

This would theoretically enable type constraint checking for any type
if we chose to add this feature to CREATE TYPE.  This would be appropriate
for base types only, including UDTs.  There may be an argument for complex
types too, though; for example, certain integer-ish array types might be
allowed to contain only unique values.  Usually types have the constraint
definition encoded in their input routines.  This gives type developers an
easier option for validating their types, since constraints can be written
in PL languages.
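
As a small example of that last point, a domain constraint can already call
a function written in a PL language today (assuming plpgsql is installed;
the names here are made up, this is just a sketch):

   CREATE FUNCTION valid_isbn10(text) RETURNS boolean AS $$
   BEGIN
       -- nine digits followed by a digit or an X; deliberately simplistic
       RETURN $1 ~ '^[0-9]{9}[0-9X]$';
   END;
   $$ LANGUAGE plpgsql IMMUTABLE;

   CREATE DOMAIN isbn10 AS text CHECK (valid_isbn10(VALUE));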

Problems and Issues:

There are a few known issues with the current implementation of domains
with overriding operators.

   * Creating a table with a domain PRIMARY KEY did not use the subtype
comparison function.
 It was necessary to create a unique index which explicitly used the domain 
operator class.
 This should be fixed by having the domain as a proper type. No domain 
checking should be necessary.
   * ORDER BY requires USING op clause.
 This may be fixed by having the domain as a proper type. No domain 
checking should be necessary.
   * LIKE requires explicit casting of the second argument to text.
 (I need to double check this. It may work OK on cvs head.)
   * COPY has a problem which still need investigating.


Why do we want this?

* We need subtypes 
* Domains are *almost* subtypes because you can override the operators
 as you can with any type.
   * Use the types as designed

The current code jumps through hoops to check for domain types. Ideally
this change should remove a lot of that code in favor of checking just
for the existence of a constraint and then only where constraint checking
is needed.

By using this technique Illustra easily added subtypes, but did not add the
domains with their additional complexities.  No special type checking for 
simple subtypes was necessary.

I may have missed some stuff here. Obviously.  For example how to divide and
conquer the various aspects of the issues raised here. But this is a high, high
level proposal at this time.

Comments, volunteers are welcome.

--elein

--
[EMAIL PROTECTED]Varlena, LLCwww.varlena.com
(510)655-2584(o) (510)543-6079(c)
  PostgreSQL Consulting, Support  Training   
--

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [HACKERS] log_duration is redundant, no?

2006-09-07 Thread Guillaume Smet

On 9/8/06, Tom Lane [EMAIL PROTECTED] wrote:

I think his basic complaint is that doing the full logging pushup for
even short-duration queries is too expensive, and that logging only the
duration and not the query text or parameters makes a significant speed
difference.  I'm not at all sure that I buy that, but if it's true then
subsequent filtering obviously doesn't help.


That's exactly my point. And on our highly loaded servers, the
log_duration behaviour makes the difference between:
- we have a clear overview of the server activity and
- we don't have any idea of what happens on this server (apart that
there are queries slower than X ms).

Regards,

--
Guillaume

---(end of broadcast)---
TIP 4: Have you searched our list archives?

  http://archives.postgresql.org


Re: [HACKERS] log_duration is redundant, no?

2006-09-07 Thread Bruce Momjian
Tom Lane wrote:
 Bruce Momjian [EMAIL PROTECTED] writes:
  Well, except for bind, all the log output display is zero cost, just a
  printf(), as I remember.  The only cost that is significant, I think, is
  the timing of the query, and that is happening for all the setttings
  discussed.
 
 On a machine with slow gettimeofday(), measuring duration at all is
 going to hurt, but apparently that is not Guillaume's situation ---
 what's costing him is sheer log volume.  And remember that the
 slow-gettimeofday problem exists mainly on cheap PCs, not server-grade
 hardware.  Based on his experience I'm prepared to believe that there
 is a use-case for logging just the duration for short queries.
 
 It seems like we should either remove the separate log_duration boolean
 or make it work as he suggests.  I'm leaning to the second answer now.
 What's your vote?

#2, I think, but I am confused: if you don't know the query, how valuable
is the log_duration?

-- 
  Bruce Momjian   [EMAIL PROTECTED]
  EnterpriseDBhttp://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] Template0 age is increasing speedily.

2006-09-07 Thread Merlin Moncure

On 9/7/06, Nimesh Satam [EMAIL PROTECTED] wrote:

We also noticed that the database slows down heavily at a particular
time. Can you suggest any tools which will help in diagnosing the root cause
behind the data load?


possible checkpoint?  poorly formulated query?  it could be any number
of things.  use standard tools to diagnose the problem, including:

unix tools: top, vmstat, etc
postgresql query logging, including log_min_duration_statement
explain analyze
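
for the last two items, a minimal illustration (the threshold, table and
column below are made up):

-- log the text and duration of any statement slower than one second
-- (needs superuser here; set it in postgresql.conf for the whole server)
SET log_min_duration_statement = 1000;

-- then look at the actual plan and timings of a suspect query
EXPLAIN ANALYZE SELECT * FROM orders WHERE customer_id = 42;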

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

  http://www.postgresql.org/docs/faq


[HACKERS] pgindent run coming

2006-09-07 Thread Bruce Momjian
It is about time to run pgindent before we enter beta testing.  Is this
weekend good for everyone?

-- 
  Bruce Momjian   [EMAIL PROTECTED]
  EnterpriseDBhttp://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


[HACKERS] Release notes

2006-09-07 Thread Bruce Momjian
I have started working on release notes for 8.2.  I should have a draft
list by Saturday.

-- 
  Bruce Momjian   [EMAIL PROTECTED]
  EnterpriseDBhttp://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [HACKERS] log_duration is redundant, no?

2006-09-07 Thread Alvaro Herrera
Bruce Momjian wrote:

 #2, I think, but I am confused if you don't know the query, how valuable
 is the log_duration.

Statistics?

-- 
Alvaro Herrerahttp://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] log_duration is redundant, no?

2006-09-07 Thread Bruce Momjian
Alvaro Herrera wrote:
 Bruce Momjian wrote:
 
  #2, I think, but I am confused if you don't know the query, how valuable
  is the log_duration.
 
 Statistics?

Oh, interesting.

-- 
  Bruce Momjian   [EMAIL PROTECTED]
  EnterpriseDBhttp://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] Domains and subtypes, a brief proposal

2006-09-07 Thread Josh Berkus
Elein,

 I may have missed some stuff here. Obviously.  For example how to divide
 and conquer the various aspects of the issues raised here. But this is a
 high, high level proposal at this time.

I'm not quite clear on what in your proposal is different from current Domain 
behavior.  Or are you just looking to remove the limitations on where Domains 
can be used?

-- 
Josh Berkus
PostgreSQL @ Sun
San Francisco

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [HACKERS] pgindent run coming

2006-09-07 Thread Tom Lane
Bruce Momjian [EMAIL PROTECTED] writes:
 It is about time to run pgindent before we enter beta testing.  Is this
 weekend good for everyone?

I think we should wait until the fate of the GUC patch is determined
--- if we want to apply it, a pgindent run is going to cause some
unnecessary work, plus we'd need to re-indent afterwards.

I've never been a fan of waiting till the last minute to run pgindent,
but I don't think we have to do it before first beta.

regards, tom lane

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] pgindent run coming

2006-09-07 Thread Bruce Momjian
Tom Lane wrote:
 Bruce Momjian [EMAIL PROTECTED] writes:
  It is about time to run pgindent before we enter beta testing.  Is this
  weekend good for everyone?
 
 I think we should wait until the fate of the GUC patch is determined
 --- if we want to apply it, a pgindent run is going to cause some
 unnecessary work, plus we'd need to re-indent afterwards.
 
 I've never been a fan of waiting till the last minute to run pgindent,
 but I don't think we have to do it before first beta.

OK.

-- 
  Bruce Momjian   [EMAIL PROTECTED]
  EnterpriseDBhttp://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] large object regression tests

2006-09-07 Thread Tom Lane
Jeremy Drake [EMAIL PROTECTED] writes:
 I noticed when I was working on a patch quite a while back that there are
 no regression tests for large object support.

Yeah, this is bad :-(

 I am considering, and I think that in order to get a real test of the
 large objects, I would need to load data into a large object which would
 be sufficient to be loaded into more than one block (large object blocks
 were 1 or 2K IIRC) so that the block boundary case could be tested.  Is
 there any precedent on where to grab such a large chunk of data from?

There's always plain old junk data, eg, repeat('xyzzy', 10).
I doubt that Moby Dick would expose any unexpected bugs ...

 ... I find that it is necessary to stash certain values across
 statements (large object ids, large object 'handles'), and so far I am
 using a temporary table to store these.  Is this reasonable, or is there a
 cleaner way to do that?

I think it's supposed to be possible to use psql variables for that;
if you can manage to test psql variables as well as large objects,
that'd be a double bonus.
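
For reference, a minimal sketch of the temp-table stashing approach using only
server-side large object functions (the numbers are arbitrary; 5000 bytes of
junk comfortably spans a 2K large object chunk):

BEGIN;
CREATE TEMP TABLE lotest_stash (loid oid, fd integer);
-- create an empty large object and remember its OID across statements
INSERT INTO lotest_stash (loid) SELECT lo_creat(42);
-- open it read/write: 393216 = INV_READ | INV_WRITE (0x40000 | 0x20000)
UPDATE lotest_stash SET fd = lo_open(loid, 393216);
-- write enough junk data to spill past a single large object chunk
SELECT lowrite(fd, decode(repeat('xyzzy', 1000), 'escape')) FROM lotest_stash;
SELECT lo_close(fd) FROM lotest_stash;
COMMIT;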

regards, tom lane

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] Fixed length data types issue

2006-09-07 Thread Tom Lane
Gregory Stark [EMAIL PROTECTED] writes:
 Peter Eisentraut [EMAIL PROTECTED] writes:
 The thing is, 100% extra space is cheap, but the processing power for 
 making the need for that extra space go away is not.

 That's simply untrue for most applications.

Well, it's true for some and not true for others: we hear from plenty of
people who seem to be more CPU-bound than IO-bound, and the former group
would not like a change along this line.  The trick with any space-saving
change would be to not expend so many cycles as to make things a lot
worse for the CPU-bound crowd.

regards, tom lane

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


[HACKERS] A note about buildfarm ecpg-check

2006-09-07 Thread Tom Lane
I see that the buildfarm script seems to be running ecpg-check pretty
early in the sequence.  Considering that the ecpg tests are still far
from stable, this seems to be taking away the opportunity to learn as
much as we can from a buildfarm run.  Could we run the ecpg tests last?

An even better idea would be to teach the script about test dependencies
so that it could run test steps even when an earlier-but-unrelated test
had failed.  But I'm sure that's a lot more work.

regards, tom lane

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [PATCHES] [HACKERS] Fix linking of OpenLDAP libraries

2006-09-07 Thread Albe Laurenz
Tom Lane wrote:
 I have realized that my modifications in configure.in and
 src/interfaces/libpq/Makefile to link libpq against
 OpenLDAP are buggy.

 Here is a proposed patch to fix it.

[...]

   # The backend doesn't need everything that's in LIBS, however
 ! LIBS := $(filter-out -lz -lreadline -ledit -ltermcap 
  -lncurses -lcurses -lldap_r $(PTHREAD_LIBS), $(LIBS))
 
 This seems pretty risky.  What if PTHREAD_LIBS contains -L switches?
 They'd get removed even if needed for other libraries.
 
 It would probably be safer not to put LDAP into LIBS at all, 
 but invent two new macros for configure to set, say LDAP_LIBS
 and LDAP_LIBS_R, and add these to the link lines in the backend
 and libpq respectively.

Here is a new patch that replaces the previous one; it adds two
macros LDAP_LIBS_FE and LDAP_LIBS_BE for frontend and backend,
respectively.

I added them not only to the Makefile for interfaces/libpq, but also
everywhere something is linked against libpq, in case somebody links
statically.
I don't know whether that's necessary, or whether static builds are
supported - if not, the changes to those Makefiles should perhaps not
be applied.

Tested on Linux, AIX and Windows.

Yours,
Laurenz Albe


ldaplink2.patch
Description: ldaplink2.patch

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] [PATCHES] Backend SSL configuration enhancement

2006-09-07 Thread Martijn van Oosterhout
On Tue, Sep 05, 2006 at 10:17:15AM +0400, Victor Wagner wrote:
 It's a pity that it's too late for the patch to get into 8.2.
 It means that during the whole 8.2 lifecycle we'll have to maintain this patch
 separately.

Hmm? After 8.2 releases, if it's ready, it will go straight into CVS at
which point it'll be in the next release.

Have a nice day,
-- 
Martijn van Oosterhout   kleptog@svana.org   http://svana.org/kleptog/
 From each according to his ability. To each according to his ability to 
 litigate.




Re: [HACKERS] [PATCHES] BUG #2600: dblink compile with SSL missing libraries

2006-09-07 Thread Tom Lane
Chris Browne [EMAIL PROTECTED] writes:
 [EMAIL PROTECTED] (Tom Lane) writes:
 No you don't --- see recent warthog complaint.  We have to filter LIBS
 down to just the minimum.

 I'm at a loss, then.

 - If LIBS is being filtered to the minimum, then shouldn't it be
   appropriate to add it in here?

No, LIBS isn't filtered at all.  See my recent commit to sslinfo's
Makefile --- I blew it just like this, you should learn from my mistake.

regards, tom lane

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq