Re: [HACKERS] Patch for collation using ICU

2005-05-10 Thread Tatsuo Ishii
> Tatsuo Ishii wrote:
> > Sent: Tuesday, May 10, 2005 12:32 AM
> > To: John Hansen
> > Cc: pgman@candle.pha.pa.us; [EMAIL PROTECTED]; 
> > pgsql-hackers@postgresql.org
> > Subject: Re: [HACKERS] Patch for collation using ICU
> > 
> > > > -Original Message-
> > > > From: Tatsuo Ishii [mailto:[EMAIL PROTECTED]
> > > > Sent: Sunday, May 08, 2005 11:08 PM
> > > > To: John Hansen
> > > > Cc: pgman@candle.pha.pa.us; [EMAIL PROTECTED]; 
> > > > pgsql-hackers@postgresql.org
> > > > Subject: Re: [HACKERS] Patch for collation using ICU
> > > > 
> > > > > > I don't buy it. If current conversion tables does the
> > > > right thing,
> > > > > > why we need to replace. Or if conversion tables are not
> > > > correct, why
> > > > > > don't you fix it? I think the rule of character
> > > > conversion will not
> > > > > > change frequently, especially for LATIN languages. Thus
> > > > maintaining
> > > > > > cost is not too high.
> > > > > 
> > > > > I never said we need to, but if we're going to implement
> > > > ICU, then we
> > > > > might as well go all the way.
> > > > 
> > > > So you admit there's no benefit using ICU for replacing existing 
> > > > conversions?
> > > > 
> > > > Besides ICU does not support all existing conversions, I 
> > think ICU 
> > > > has serious flaw for using conversion. If I understand correctly, 
> > > > ICU uses UNICODE internally to do the conversion. For example, to 
> > > > implement
> > > > SJIS->EUC_JP conversion, ICU first converts SJIS to UNICODE then
> > > > converts UNICODE to EUC_JP. Problem is these conversion 
> > is not roud 
> > > > trip(conversion between SJIS/EUC_JP and UNICODE will lose some 
> > > > information). Thus SJIS->EUC_JP->SJIS conversion using 
> > ICU does not 
> > > > preserve original text.
> > > 
> > > Just for the record, I fetched a web page encoded in sjis, and 
> > > converted it to euc-jp and back using uconv from ICU 3.2, and the 
> > > result is the original is identical to the transformed file.
> > > 
> > >  uconv -f Shift_JIS -t EUC-JP -o index.html.euc index.html  
> > uconv -f 
> > > EUC-JP -t Shift_JIS -o index.html.sjis index.html.euc  diff 
> > index.html 
> > > index.html.sjis
> > 
> > Not all SJIS/EUC_JP characters have the problem. You might want to
> > try: Shift_JIS 0x81e6, 0x879a, 0xfa5b.
> > 
> > BTW, I got this with ICU 3.2:
> > 
> > $ uconv -f EUC_JP -t Shift_JIS /tmp/a.txt -o /tmp/b.txt 
> > Conversion from Unicode to codepage failed at input byte 
> > position 0. Unicode: 301c Error: Invalid character found
> > 
> > The contents of a.txt is 0xa1c1 which is a valid EUC_JP character.
> 
> That actually makes perfect sense, since according to unicode.org's
> database:
> 301C ~ WAVE DASH
>This character was encoded to match JIS C 6226-1978 1-33 "wave
> dash".
>The JIS standards and some industry practise disagree in mapping.
>- 3030 wavy dash
>- FF5E full width tilde
> 
> In PG FF5E is the mapping currently used. That is obviously wrong
> (according to the standards), as that is only a 'similar character'.
> 
> Unfortunately, there is no mapping from 301C to shift_jis, as shift_jis
> doesn't define "WAVE DASH".
> In all, I believe this behaviour to be correct according to the
> standards.
> 
> There'd be nothing to stop us from defining alternative mappings for the
> cases where we deviate from the standard, but the question is, should we
> be non-standard?

You missed the point. EUC_JP 0xa1c1 is a perfect valid data and 
uconv -f EUC_JP -t Shift_JIS should convert it to Shift_JIS 0x8160
regardless of the internal of uconv.
--
Tatsuo Ishii

---(end of broadcast)---
TIP 8: explain analyze is your friend


Re: [HACKERS] Case insensitive usernames

2005-05-10 Thread Magnus Hagander
> > Another way to help in this particular case would be to 
> have libpq on
> > win32 only force-lowercase the username IF it was retreived 
> from the 
> > system (but not when manually specified).
> 
> Well, I personally don't care how bizarrely the Win32 port 
> behaves ;-) so I won't complain if something like that 
> happens.  You should think twice though about whether 
> introducing this inconsistency is going to be a net win, or 
> whether it'll just move the confusion someplace else.

Windows is case-insensitive to usernames. Always has been, and I really
don't think it's about to change considering M$'s deal with backwards
compatiblity. I doubt if this would surprise anybody - at least anybody
on the win32 platform.

Actualyl, what usually surprises people is that GetUserName() returns
the case of the username *as the user entered it at login*, and *NOT* as
the administrator entered it in the Active Directory. Which can be said
to be a bug depending on how you look at it, but from a "username is
case insensitive" perspective it doesn't matter.

Note that we wouldn't case-change a userid that is actually specified by
the user (in the connectino string, on the commandline to psql).


> > Then if this was done the
> > kerberos username-matching code would just have to be relaxed to be 
> > case insensitive (which it really should be, because AFAIK 
> kerberos is 
> > supposed to be case insensitive),
> 
> This however bothers me; it seems like a potential security 
> hole (create kerberos principal FOO, use it to break into 
> Foo's account). 

Actually, to be specific, if you case-fold it he could get into "foo"
but not "Foo".

> Or does kerberos guarantee FOO and Foo are the same?

Did some further checking, and it turns out this depends on the KDC. All
KDCs are required to be case sensitive no the *REALM*. But I can't find
any documented requirements on the principal - just signs that different
KDCs treat it differently. Windows AD KDC for example is insensitive,
whereas I think it looks like MIT is case sensitive.

Which brings me back to thinking a GUC is the way to deal with that -
you'll definitly know what kind of KDC you have when you set up
Kerberos. But perhaps this GUC should be for "permit case-insensitive
kerberos principals" and not "case-insensitive usernames". And it would
just control the comparison between kerberos principal and user-supplied
username. The user-supplied username would still be what's used in any
access to the database, regardless of case.

Reasonable?


//Magnus

---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] Oracle Style packages on postgres

2005-05-10 Thread Thomas Hallgren
Bruce Momjian wrote:
Josh Berkus wrote:
I think that private variables and private functions need to be part of the 
definition.

OK, so it seems we need:
C static/private functions for schemas
C static/private variables for schemas
Are private variables implemented via the temporary per-session schema?
nested schemas
What does the standard say?
The standard says that rather then using nested schemas or packages in 
conjunction with functions in order to group functions with data, use 
user defined types with attributes and methods. Methods can be STATIC, 
INSTANCE or CONSTRUCTOR.

AFAICS there's nothing in the standard that alters the visibility of 
methods and attributes, so "private" is not covered (and don't recall 
that Oracle has that kind of visibility control either). Normal access 
restrictions apply of course.

I can't find any mention of schema variables. I think all life-cycle 
management of data is reduced to table storage. And why not? A temporary 
table can be viewed as session data right?

Using a KISS approach, the easiest thing to do that also would bring us 
closer to the standard, is to extend the notion of user defined types to 
include methods and conclude that storing session data in other ways 
than using temporary tables should be PL specific.

Regards,
Thomas Hallgren
---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


Re: [HACKERS] Oracle Style packages on postgres

2005-05-10 Thread Adrian Maier
-- Forwarded message --
From: Adrian Maier <[EMAIL PROTECTED]>
Date: May 10, 2005 12:01 PM
Subject: Re: [HACKERS] Oracle Style packages on postgres
To: "Jim C. Nasby" <[EMAIL PROTECTED]>


On 5/9/05, Jim C. Nasby <[EMAIL PROTECTED]> wrote:
> On Sun, May 08, 2005 at 10:38:41PM -0500, Bob wrote:
> > One simple benefit to packages is just organization of related code.
>
> Which, IMHO, is greatly diminished by the lack of
> schema.package.function notation. BTW, the original post referred to
> this as user.package.function, but I believe that technically it's
> actually schema.package.function (Oracle tends to mix schemas and
> users). In any case, schema.package.function is what would make sense in
> PostgreSQL.
>
> Personally, I think the biggest win here would be adding package support
> and syntax to plpgsql. Not only would it make porting from Oracle
> easier, it would also make plpgsql much, much more powerful.

Hello,

What do you think about having some kind of language-independent
packages ?
I'm thinking that it could be handy to implement some functions in
plpgsql, some functions in plpython and so . And then bundle them
together into the same package.

Cheers,
Adrian Maier

---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])


Re: [HACKERS] Patch for collation using ICU

2005-05-10 Thread John Hansen
Tatsuo Ishii wrote:
> Sent: Tuesday, May 10, 2005 5:45 PM
> To: John Hansen
> Cc: pgman@candle.pha.pa.us; [EMAIL PROTECTED]; 
> pgsql-hackers@postgresql.org
> Subject: Re: [HACKERS] Patch for collation using ICU
> 
> > Tatsuo Ishii wrote:
> > > Sent: Tuesday, May 10, 2005 12:32 AM
> > > To: John Hansen
> > > Cc: pgman@candle.pha.pa.us; [EMAIL PROTECTED]; 
> > > pgsql-hackers@postgresql.org
> > > Subject: Re: [HACKERS] Patch for collation using ICU
> > > 
> > > > > -Original Message-
> > > > > From: Tatsuo Ishii [mailto:[EMAIL PROTECTED]
> > > > > Sent: Sunday, May 08, 2005 11:08 PM
> > > > > To: John Hansen
> > > > > Cc: pgman@candle.pha.pa.us; [EMAIL PROTECTED]; 
> > > > > pgsql-hackers@postgresql.org
> > > > > Subject: Re: [HACKERS] Patch for collation using ICU
> > > > > 
> > > > > > > I don't buy it. If current conversion tables does the
> > > > > right thing,
> > > > > > > why we need to replace. Or if conversion tables are not
> > > > > correct, why
> > > > > > > don't you fix it? I think the rule of character
> > > > > conversion will not
> > > > > > > change frequently, especially for LATIN languages. Thus
> > > > > maintaining
> > > > > > > cost is not too high.
> > > > > > 
> > > > > > I never said we need to, but if we're going to implement
> > > > > ICU, then we
> > > > > > might as well go all the way.
> > > > > 
> > > > > So you admit there's no benefit using ICU for 
> replacing existing 
> > > > > conversions?
> > > > > 
> > > > > Besides ICU does not support all existing conversions, I
> > > think ICU
> > > > > has serious flaw for using conversion. If I understand 
> > > > > correctly, ICU uses UNICODE internally to do the 
> conversion. For 
> > > > > example, to implement
> > > > > SJIS->EUC_JP conversion, ICU first converts SJIS to 
> UNICODE then
> > > > > converts UNICODE to EUC_JP. Problem is these conversion
> > > is not roud
> > > > > trip(conversion between SJIS/EUC_JP and UNICODE will 
> lose some 
> > > > > information). Thus SJIS->EUC_JP->SJIS conversion using
> > > ICU does not
> > > > > preserve original text.
> > > > 
> > > > Just for the record, I fetched a web page encoded in sjis, and 
> > > > converted it to euc-jp and back using uconv from ICU 
> 3.2, and the 
> > > > result is the original is identical to the transformed file.
> > > > 
> > > >  uconv -f Shift_JIS -t EUC-JP -o index.html.euc index.html
> > > uconv -f
> > > > EUC-JP -t Shift_JIS -o index.html.sjis index.html.euc  diff
> > > index.html
> > > > index.html.sjis
> > > 
> > > Not all SJIS/EUC_JP characters have the problem. You might want to
> > > try: Shift_JIS 0x81e6, 0x879a, 0xfa5b.
> > > 
> > > BTW, I got this with ICU 3.2:
> > > 
> > > $ uconv -f EUC_JP -t Shift_JIS /tmp/a.txt -o /tmp/b.txt 
> Conversion 
> > > from Unicode to codepage failed at input byte position 0. 
> Unicode: 
> > > 301c Error: Invalid character found
> > > 
> > > The contents of a.txt is 0xa1c1 which is a valid EUC_JP character.
> > 
> > That actually makes perfect sense, since according to unicode.org's
> > database:
> > 301C ~ WAVE DASH
> >This character was encoded to match JIS C 6226-1978 
> 1-33 "wave 
> > dash".
> >The JIS standards and some industry practise 
> disagree in mapping.
> >  - 3030 wavy dash
> >  - FF5E full width tilde
> > 
> > In PG FF5E is the mapping currently used. That is obviously wrong 
> > (according to the standards), as that is only a 'similar character'.
> > 
> > Unfortunately, there is no mapping from 301C to shift_jis, as 
> > shift_jis doesn't define "WAVE DASH".
> > In all, I believe this behaviour to be correct according to the 
> > standards.
> > 
> > There'd be nothing to stop us from defining alternative 
> mappings for 
> > the cases where we deviate from the standard, but the question is, 
> > should we be non-standard?
> 
> You missed the point. EUC_JP 0xa1c1 is a perfect valid data 
> and uconv -f EUC_JP -t Shift_JIS should convert it to 
> Shift_JIS 0x8160 regardless of the internal of uconv.

Studying ICU forther, I found that it works fine, provided you use the
_correct_ charset for the conversion..

a.txt contains 0x81 0x60
uconv -f ibm-943_P130-1999 -t EUC_JP a.txt -o b.txt
b.txt now contains 0xa1 0xc1
uconv -t ibm-943_P130-1999 -f EUC_JP b.txt -o a.txt
a.txt still contains 0x81 0x60

The mapping table you want is ibm-943_P130-1999
Similar, we'd need to find the right euc-jp (and plain jis) mapping,
assuming we want the one that strictly defines JIS X 0208 right?

I trust this to put your fears to rest...

> --
> Tatsuo Ishii
> 
> 

... John

---(end of broadcast)---
TIP 8: explain analyze is your friend


Re: [HACKERS] request for sql3 compliance for the update command

2005-05-10 Thread Bruce Momjian
Greg Stark wrote:
> 
> Tom Lane <[EMAIL PROTECTED]> writes:
> 
> > UPDATE totals SET
> >   xmax = ss.xmax, xmin = ss.xmin, ...
> > FROM
> >   (SELECT groupid, max(x) AS xmax, ... FROM details GROUP BY groupid) ss
> > WHERE groupid = ss.groupid;
> > 
> ...
> > 
> > Of course this syntax isn't standard either ... but we already have it.
> 
> 
> Did this patch ever make it in? It's not documented in the 8.0 documentation
> for UPDATE at:
> 
> http://www.postgresql.org/docs/8.0/interactive/dml-update.html

It is documented only in the UPDATE manual page because it is fairly exotic:

http://www.postgresql.org/docs/8.0/interactive/sql-update.html


-- 
  Bruce Momjian|  http://candle.pha.pa.us
  pgman@candle.pha.pa.us   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073

---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


Re: [HACKERS] Table Partitioning, Part 1

2005-05-10 Thread Hannu Krosing
On E, 2005-05-09 at 23:30 +0100, Simon Riggs wrote:
> A more in-depth consideration of the major design options and trade-offs
> available to us... this is an internals related discussion.
> 
> Comments?
> 
> 1. Embellish inheritance or separate infrastructure?
> 
> Inheritance is a somewhat strange PostgreSQL feature, with a few
> gotchas. The "big one" is the lack of a multi-relation index that can be
> used against all of the inherited relations also.

I think this can be solevd in  two ways - 

1. create a "merged index" which is similar to a partitioned table,
where inserts go to the index corresponding to underlying table but
selects (including uniqueness checks) are done by merging data from
several indexes. This also solves the problem of fast adding and
deleting the partitions to/from the global index.

2. second is variation of my old suggestion of using the fact that we
have 1G storage tables for implementing "partitioned tables" - just make
a mapping from each 1G (128kpage) range to subtable/offset and then
build the index as usual. This has problems with max partitioned table
size (same as our current max for ordinary tables) and also slow adding
of new partitions - deleting can be done fast by mapping the allocated
range to 'removed'.

If 1. can be implemented reasonably efficiently for large number of
merged indexes then there is no point in investigating 2. .

> The full complexity of multiple inheritance, possible multi-level
> inheritance doesn't seem to solve any problems for us. The main argument
> in favour of improving inheritance as a route to partitioning is that
> many people have already developed applications using it to provide some
> of the desirable aspects of partitioning. The amount of code used to
> provide some of these features is actually quite low, and the code
> itself is already specialised. 
> 
> It seems prudent to avoid building on that foundation, even though we
> may decide to use some similar approaches.

I would not mind if we give up multiple inheritance. 

OTOH multi-level inheritance (and thus multi-level partitioning) can be
quite useful sometimes. For example I could have old data partitioned by
YEAR, but newer also sub-partitioned by month or even day.

Then I could manually PE all previous year by simply doing my query
against sales_2005 and let the automatic PE worry for doing the right
thing within this year.

Also, the multi-level nature of inheritance should be hidden from
planner/optimiser by moving the sub-sub-sub tables up the hierarchy into
a single flat append-like plan. 

At least until/if we start applying check constraints to table and its
all child tables, in which case some of the PE could be done on
intermediate nodes before expansion.

> 2. Individual Relations explicitly in the plan or MultiRelation plan
> nodes? (i.e. is a SeqScan of a Partitioned Table one Node or many
> nodes?)
> 
> Currently, Inheritance produces a plan for a SeqScan that looks like
> this...
> 
> Append
>   SeqScan
>   SeqScan
>   SeqScan

Seems reasonable for a seqscan :)

I have an inheritance-paritioned table of about 50 tables and when I
restrict it on some column that has an index then the plan does index-
scan on bigger tables and seqscan with some old tables which have only a
few thousands of rows,

> ISTM fairly straightforward to produce a similar "static" plan along the
> same lines, using Result nodes to implement Partition Elimination.
> 
> Append
>   Result
> SeqScan
>   Result
> SeqScan
>   Result
> SeqScan

So you mean another node that looks inside seqscans restrictions and
then determines if there is any chance that seqscan will return any
rows ? 

Why can't this be done at planning time ? And if it can't be done at
planning time, then how do you determine which plan is cheapest ?

> Some may think that looks a little clunky, especially since the plan
> length would increase as the number of Partitions increased. An
> alternative, would be to have a single Scan node that performs a Multi-
> Partition Scan, which looks like this...
> 
> PartitionSeqScan
> 
> There'd always one node, no matter how many partitions. Again, easy to
> implement, since we just add a new type of Scan access method, or alter
> the existing ones iff we access partitioned tables. We'd be able to use
> physical tlists, and all the complexity is hidden.

Pehaps we could have just one scan type which looks like this

Scan

And hide all the complexity within it ? ;p

Greater visibility of internals is good, both for optimiser and for user
trying to understand what is going on (explain analyse). 

I don't think many nodes are a problem per se, only when they cause some
non-linear growth in some operations.

>
...
> 
> My current thinking is to implement explicitly partition plans.

Agreed.



As a general guideline I think it is (almost) never a good idea to trade
CPU cycles for disk accesses.

The speed difference between CPU and disk seems destined to grow in the
fors

Re: [HACKERS] Oracle Style packages on postgres

2005-05-10 Thread Bruce Momjian
Thomas Hallgren wrote:
> Bruce Momjian wrote:
> > Josh Berkus wrote:
> > 
> >>I think that private variables and private functions need to be part of the 
> >>definition.
> > 
> > 
> > OK, so it seems we need:
> > 
> > C static/private functions for schemas
> > C static/private variables for schemas
> > 
> > Are private variables implemented via the temporary per-session schema?
> > 
> > nested schemas
> > 
> > What does the standard say?
> > 
> The standard says that rather then using nested schemas or packages in 
> conjunction with functions in order to group functions with data, use 
> user defined types with attributes and methods. Methods can be STATIC, 
> INSTANCE or CONSTRUCTOR.

So it sounds like you group the functions into user-defined types,
rather than nested schemas.  So you assocate functions with a table?

> AFAICS there's nothing in the standard that alters the visibility of 
> methods and attributes, so "private" is not covered (and don't recall 
> that Oracle has that kind of visibility control either). Normal access 
> restrictions apply of course.
> 
> I can't find any mention of schema variables. I think all life-cycle 
> management of data is reduced to table storage. And why not? A temporary 
> table can be viewed as session data right?
> 
> Using a KISS approach, the easiest thing to do that also would bring us 
> closer to the standard, is to extend the notion of user defined types to 
> include methods and conclude that storing session data in other ways 
> than using temporary tables should be PL specific.

I suppose.  I think we should focus on the use cases for Oracle
packages, rather than the specific functionality it provides.  What
things do people need PostgreSQL to do that it already doesn't do?

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  pgman@candle.pha.pa.us   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073

---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: [HACKERS] Table Partitioning, Part 1

2005-05-10 Thread Hannu Krosing
On T, 2005-05-10 at 16:31 +0300, Hannu Krosing wrote:
> On E, 2005-05-09 at 23:30 +0100, Simon Riggs wrote:

> There are 2 possibly expensive steps: 
> 
> 1. the conversion to "AND'ed list of simple clauses" (unknown
> complexity) 
> 
> 2. matching each of "simple" clauses in the and list with all others
> (should be N+(N-1)+(N-2)+..+(1) ~= 2N) complexity)

actually not 2N but (N * ((N-1)/2) , thus 3 clauses need 2+1=3 checks
and 11 clasues need (10+9+..+1) = 55 checks.

-- 
Hannu Krosing <[EMAIL PROTECTED]>


---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [HACKERS] Case insensitive usernames

2005-05-10 Thread Tom Lane
"Magnus Hagander" <[EMAIL PROTECTED]> writes:
> Which brings me back to thinking a GUC is the way to deal with that -
> you'll definitly know what kind of KDC you have when you set up
> Kerberos. But perhaps this GUC should be for "permit case-insensitive
> kerberos principals" and not "case-insensitive usernames". And it would
> just control the comparison between kerberos principal and user-supplied
> username. The user-supplied username would still be what's used in any
> access to the database, regardless of case.

That would work for me as long as the default is case-sensitive; the
other seems too likely to be a security hazard.  (And it had better be
documented that way, too: "DO NOT turn this on unless you are certain
you are using a case-insensitive KDC.")

What will we call the GUC?  kerberos_case_insensitive_principals
seems a bit, um, verbose.

regards, tom lane

---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] Cost of XLogInsert CRC calculations

2005-05-10 Thread Mark Cave-Ayland

> -Original Message-
> From: Mark Cave-Ayland [mailto:[EMAIL PROTECTED]
> Sent: 07 March 2005 11:04
> To: '[EMAIL PROTECTED]'
> Cc: 'pgsql-hackers@postgreSQL.org'
> Subject: Re: Cost of XLogInsert CRC calculations

(cut)

> > I suppose that the bulk of the CPU cycles being attributed to
> > XLogInsert are going into > the inlined CRC calculations.  Maybe we 
> > need to think twice about the cost/benefit ratio > of using 64-bit 
> > CRCs to protect xlog records that are often only a few dozen bytes.
> 
> Wow, a 64-bit CRC does seem excessive, especially when going
> back to Zmodem days where a 50-100k file seemed to be easily 
> protected by a 32-bit CRC. I'm sure there are some error 
> rates somewhere dependent upon the polynomial and the types 
> of error detected Try the following link towards the 
> bottom: http://www.ee.unb.ca/tervo/ee4253/crc.htm for some 
> theory on detection rates vs. CRC size.


Hi Tom/Simon,

I was just researching some articles on compression (zlib) and I saw mention
of the Adler-32 algorithm which is supposed to be slightly less accurate
than an equivalent CRC calculation but significantly faster to compute. I
haven't located a good paper comparing the error rates of the two different
checksums, however extending the Adler-32 algorithm up to 64 bits may be a
way of increasing the speed at the expense of a slight loss in the accuracy
of error detection if we decided that we want to stay at 64 bits as opposed
to 32. 

The following seems to indicate that Adler-32 is at least twice as fast as
optimised CRC32: http://www.winimage.com/misc/readfile_test.htm.


Kind regards,

Mark.


WebBased Ltd
17 Research Way
Plymouth
PL6 8BT 

T: +44 (0)1752 791021
F: +44 (0)1752 791023
W: http://www.webbased.co.uk



---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: [HACKERS] Cost of XLogInsert CRC calculations

2005-05-10 Thread Bruce Momjian
Tom Lane wrote:
> "Mark Cave-Ayland" <[EMAIL PROTECTED]> writes:
> > I was just researching some articles on compression (zlib) and I saw mention
> > of the Adler-32 algorithm which is supposed to be slightly less accurate
> > than an equivalent CRC calculation but significantly faster to compute. I
> > haven't located a good paper comparing the error rates of the two different
> > checksums,
> 
> ... probably because there isn't one.  With all due respect to the Zip
> guys, I doubt anyone has done anywhere near the analysis on Adler-32
> that has been done on CRCs.  I'd much prefer to stick with true CRC
> and drop it to 32 bits than go with a less-tested algorithm.  Throwing
> more bits at the problem doesn't necessarily create a safer checksum.

Agreed.  64-bit was overkill when we added it, and it is now shown to be
a performance problem.

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  pgman@candle.pha.pa.us   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073

---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


Re: [HACKERS] Cost of XLogInsert CRC calculations

2005-05-10 Thread Tom Lane
"Mark Cave-Ayland" <[EMAIL PROTECTED]> writes:
> I was just researching some articles on compression (zlib) and I saw mention
> of the Adler-32 algorithm which is supposed to be slightly less accurate
> than an equivalent CRC calculation but significantly faster to compute. I
> haven't located a good paper comparing the error rates of the two different
> checksums,

... probably because there isn't one.  With all due respect to the Zip
guys, I doubt anyone has done anywhere near the analysis on Adler-32
that has been done on CRCs.  I'd much prefer to stick with true CRC
and drop it to 32 bits than go with a less-tested algorithm.  Throwing
more bits at the problem doesn't necessarily create a safer checksum.

regards, tom lane

---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


Re: [HACKERS] Oracle Style packages on postgres

2005-05-10 Thread Thomas Hallgren
Bruce Momjian wrote:
Thomas Hallgren wrote:
 

Bruce Momjian wrote:
   

Josh Berkus wrote:
 

I think that private variables and private functions need to be part of the 
definition.
   

OK, so it seems we need:
C static/private functions for schemas
C static/private variables for schemas
Are private variables implemented via the temporary per-session schema?
nested schemas
What does the standard say?
 

The standard says that rather then using nested schemas or packages in 
conjunction with functions in order to group functions with data, use 
user defined types with attributes and methods. Methods can be STATIC, 
INSTANCE or CONSTRUCTOR.
   

So it sounds like you group the functions into user-defined types,
rather than nested schemas.
Yes, you'd get .. as the fully qualified name.
 So you assocate functions with a table?
 

Not necessarily a table. A type is just a type. A type containing just 
static methods is not very different from an Oracle package. A table 
created from a type may of course have methods associated with it. That 
gets really interesting when you use INSTANCE methods. They act on a per 
row basis so that you can do things like:

SELECT x.someMethod() FROM someTable x;
rather than as today.
SELECT someFunction(x) FROM someTable x;
Regards,
Thomas Hallgren

---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [HACKERS] Oracle Style packages on postgres

2005-05-10 Thread Daniel Schuchardt
Bruce Momjian schrieb:
OK, so it seems we need:
C static/private functions for schemas
C static/private variables for schemas
Are private variables implemented via the temporary per-session schema?
nested schemas
What does the standard say?
Is that it?
 

Yeah,
that would be great. And don't forget global variables for pl/pgsql.
Daniel
---(end of broadcast)---
TIP 8: explain analyze is your friend


[HACKERS] Please clarify

2005-05-10 Thread palanivel . kumaran




Hai,

I need a clarification for the below:

I need to check for the existence of a user defined view named 'audit_vw'
and if exists, then i need to delete the same. Please help me to solve the
issue.

Thanks & Regards
Palanivel P.K

Important Email Information :- The  information  in  this  email is
confidential and may  be  legally  privileged. It  is  intended  solely for
the addressee. Access to  this email  by anyone  else  is  unauthorized.  If
you are not the intended recipient, any disclosure, copying, distribution or
any action taken or omitted to be taken in reliance on it, is prohibited
and may be unlawful. If you are not the intended addressee please contact
the sender and dispose of this e-mail immediately.


---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [HACKERS] Views, views, views! (long)

2005-05-10 Thread Peter Eisentraut
Am Montag, 9. Mai 2005 00:41 schrieb Andrew - Supernews:
> >> c) In most places, "system" objects are segregated from
> >> "user" objects,  e.g. pg_user_indexes
> >
> > I think that is a bad idea as it goes against the fundamental design of
> > PostgreSQL.
>
> In what way? Please elaborate.

PostgreSQL does not really distinguish between "system" and "user" things.  
How will you do that?

-- 
Peter Eisentraut
http://developer.postgresql.org/~petere/

---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [HACKERS] Views, views, views! (long)

2005-05-10 Thread Peter Eisentraut
Am Freitag, 6. Mai 2005 12:20 schrieb Andreas Pflug:
> > and the information_schema is next to useless for these things since
> >  it doesn't have PostgreSQL specific things in it.
>
> And the restriction to current user owned objects reduces usability to
> zero.

The information schema restricts the views to the objects to which you have 
some access right, which doesn't seem all that useless.

-- 
Peter Eisentraut
http://developer.postgresql.org/~petere/

---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] Oracle Style packages on postgres

2005-05-10 Thread Dave Held
> -Original Message-
> From: Bruce Momjian [mailto:[EMAIL PROTECTED]
> Sent: Tuesday, May 10, 2005 8:43 AM
> To: Thomas Hallgren
> Cc: Tom Lane; [EMAIL PROTECTED]; pgsql-hackers@postgresql.org
> Subject: Re: [HACKERS] Oracle Style packages on postgres
> 
> [...]
> I suppose.  I think we should focus on the use cases for Oracle
> packages, rather than the specific functionality it provides. 
> What things do people need PostgreSQL to do that it already
> doesn't do?

Is that really the best way to go about things?  Already RDBMSes
are patchwork quilts of functionality.  Is merely adding another
patch the most elegant way to evolve the database?  The problem is
that Oracle et al are trying to be ORDBMSes and aren't exactly sure
what the best way to go is.  Instead of trying to formulate a 
rational plan for what an ORDBMS should even look like, they simply
look at what would work with their existing infrastructure and tack
on features.  Then Postgres plays the copycat game.  Instead of
trying to play catch-up with Oracle, why not beat them at their own
game?

What packages provide is encapsulation.  Hiding the data from the
user and forcing him/her to use the public interface (methods).
That is an important and admirable OO feature.  Some people think
that using the DB's security model can achieve the same thing.  It
can't, exactly, but there's an important lesson to be learned from
the suggestion.  The problem is that OOP is a *programming* paradigm,
and a database is not a *programming language*.  In a programming
language, there really is no such thing as "security".  There is 
only "visibility" and "accessibility".  Private methods in an OOP
language do not provide *security*; they only limit *accessibility*.
Like so many other differences between the relational model and the
OOP model, there is an impedance mismatch here.  However, there is
also opportunity.

In an OOPL, you can say: "Users can call this method from here, but
not from there."  What you *can't* say is: "User X can call this
method, but User Y cannot."  As you can see, these are orthogonal
concepts.  You could call the first "accessibility by location" and
the second "accessibility by authentication".  An ORDBMS should
support both.  "Private" does not respect your identity, only your
calling location.  An ACL does not respect your calling scope, only
your identity.  A system that has both is clearly more flexible than
one that only has one or the other.

Now what you need to keep in mind is that each visibility model 
serves a different purpose.  The purpose of a security model is to 
limit *who* can see/touch certain data because the data has intrinsic 
value.  The purpose of an accessibility model is to limit *where* and 
*how* data can be seen/touched in order to preserve *program 
invariants*.  So if you have an object (or tuple!) that records the 
start and stop time of some process, it is probably a logical 
invariant that the stop time is greater than or equal to the start 
time.  For this reason, in a PL, you would encapsulate these fields 
(attributes) and only provide controlled access to update them that 
checks and preserves the invariant, *no matter who you are*.  You 
don't want a superuser violating this invariant any more than Sue 
User.

Now you might object that constraints allow you to preserve 
invariants as well, and indeed they do.  But constraints do not
respect calling scope.  Suppose there is a process that needs to
update the timestamps in a way that temporarily breaks the invariant
but restores it afterwards.  The only way to effect this in a
constraint environment is to drop the constraint, perform the
operation, and restore it.  However, dropping a constraint is not an
ideal solution because there may be other unprivileged processes 
operating on the relation that still need the constraint to be 
enforced.  There is no way to say: "There is a priviledged class of 
methods that is allowed to violate this constraint because they are 
trusted to restore it upon completion."  Note that this is different
from saying "There is a priviledged class of users that is allowed
to violate this constraint."  If you try to do something like give
read-only access to everybody and only write access to one user and
define that user to be the owner of the methods that update the data,
you have to follow the convention that that user only operates 
through the defined interface, and doesn't hack the data directly.
That's because user-level accessibility is not the same as scope-
level accessibility.  Whereas, if you define something like a
package, and say: "Package X is allowed full and complete access
to relation Y", and stick the interface methods in X, you still have
all the user-level security you want while preserving the invariants
in the most elegant way.

So you can think of a package as a scope in a programming language.
It's like a user, but it is not a user.  A user has privileges that
cut across scopes.  Now

Re: [HACKERS] Views, views, views! (long)

2005-05-10 Thread Merlin Moncure
> Am Freitag, 6. Mai 2005 12:20 schrieb Andreas Pflug:
> > > and the information_schema is next to useless for these things
since
> > >  it doesn't have PostgreSQL specific things in it.
> >
> > And the restriction to current user owned objects reduces usability
to
> > zero.
> 
> The information schema restricts the views to the objects to which you
> have
> some access right, which doesn't seem all that useless.

Not useless at all, I couldn't' agree with you more. In fact, there is a
good security argument to be made here as well.  In the current state of
things, a user with no rights at all can see the entire database schema
including all the source code to the stored procedures.  There are means
to circumvent this but they are very coarse and can cause unexpected
problems.

I think the information_schema approach to system metadata is ultimately
the correct one...but I also agree with Josh in that the various client
tools such as pgadmin and perhaps even pg_dump should be transitioned to
using views.  After that the true system columns will no longer need
public access and everything becomes very elegant.  To me, this is a
much finer grained security model and nobody complains about extra
security features.

Merlin



---(end of broadcast)---
TIP 8: explain analyze is your friend


Re: [HACKERS] Case insensitive usernames

2005-05-10 Thread Magnus Hagander
> > Which brings me back to thinking a GUC is the way to deal 
> with that - 
> > you'll definitly know what kind of KDC you have when you set up 
> > Kerberos. But perhaps this GUC should be for "permit 
> case-insensitive 
> > kerberos principals" and not "case-insensitive usernames". And it 
> > would just control the comparison between kerberos principal and 
> > user-supplied username. The user-supplied username would still be 
> > what's used in any access to the database, regardless of case.
> 
> That would work for me as long as the default is 
> case-sensitive; the other seems too likely to be a security 
> hazard.  (And it had better be documented that way, too: "DO 
> NOT turn this on unless you are certain you are using a 
> case-insensitive KDC.")

Fine with me - you'll need to tweak the default principal name anyway to
work with the windwos KDC, so you're giong there anyawy. It's just a
matter of documenting it.


> What will we call the GUC?  kerberos_case_insensitive_principals
> seems a bit, um, verbose.
All other kerberos parameters are krb_ and not kerberos_, so that saves
a bit :) How about just "krb_case_insensitive"? Or "krb_case_ins_princ"?

//Magnus

---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] Please clarify

2005-05-10 Thread Christopher Kings-Lynne
Try selecting from pg_views to see if it exists, then if it does, drop it.
Chris
[EMAIL PROTECTED] wrote:

Hai,
I need a clarification for the below:
I need to check for the existence of a user defined view named 'audit_vw'
and if exists, then i need to delete the same. Please help me to solve the
issue.
Thanks & Regards
Palanivel P.K
Important Email Information :- The  information  in  this  email is
confidential and may  be  legally  privileged. It  is  intended  solely for
the addressee. Access to  this email  by anyone  else  is  unauthorized.  If
you are not the intended recipient, any disclosure, copying, distribution or
any action taken or omitted to be taken in reliance on it, is prohibited
and may be unlawful. If you are not the intended addressee please contact
the sender and dispose of this e-mail immediately.
---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster
---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


[HACKERS] Hashagg planning bug (8.0.1)

2005-05-10 Thread Rod Taylor
It would seem that the planner does not take into account whether the
datatypes involved have the capability to use hash aggregates or not.

sdb=# explain select query_start, current_query from pg_locks join 
stat_activity on pid = procpid where granted = true and transaction in (select 
transaction from pg_locks where granted = false);
ERROR:  could not find hash function for hash operator 716373

ssdb=# set enable_hashagg = off;
SET

ssdb=# explain select query_start, current_query from pg_locks join 
stat_activity on pid = procpid where granted = true and transaction in (select 
transaction from pg_locks where granted = false);
 QUERY PLAN

 Hash Join  (cost=60.79..159.12 rows=1244 width=40)
   Hash Cond: ("outer".procpid = "inner".pid)
   ->  Function Scan on stat_activity  (cost=0.00..15.00 rows=995 width=44)
 Filter: (current_query <> ''::text)
   ->  Hash  (cost=60.16..60.16 rows=250 width=4)
 ->  Hash Join  (cost=40.16..60.16 rows=250 width=4)
   Hash Cond: ("outer"."transaction" = "inner"."transaction")
   ->  Function Scan on pg_lock_status l  (cost=0.00..15.00 
rows=500 width=8)
 Filter: (granted = true)
   ->  Hash  (cost=39.91..39.91 rows=100 width=4)
 ->  Unique  (cost=37.41..39.91 rows=100 width=4)
   ->  Sort  (cost=37.41..38.66 rows=500 width=4)
 Sort Key: l."transaction"
 ->  Function Scan on pg_lock_status l  
(cost=0.00..15.00 rows=500 width=4)
   Filter: (granted = false)
(15 rows)

-- 


---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [HACKERS] Hashagg planning bug (8.0.1)

2005-05-10 Thread Tom Lane
Rod Taylor <[EMAIL PROTECTED]> writes:
> It would seem that the planner does not take into account whether the
> datatypes involved have the capability to use hash aggregates or not.

> sdb=# explain select query_start, current_query   from pg_locks join 
> stat_activity on pid = procpid where granted = true and transaction in 
> (select transaction from pg_locks where granted = false);
> ERROR:  could not find hash function for hash operator 716373

What's stat_activity?  I thought you meant pg_stat_activity, but that
works fine here.

regards, tom lane

---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


[HACKERS] Adding callback support.

2005-05-10 Thread Nicolai Petri
Hi ppl,

I'm currently building some stored procedures in C that uses some internal 
hash tables - It could be really nice to be able to deallocate those 
correctly when e.g. a memctx is destroyed. Would it be possible to add this 
as a postgresql feature and how should it be done.

I also have some other tasks where I would like to do specific actions when a 
transaction completes or aborts. This could be used for maintaining a 
pgmemcache correctly and many other cases. Could this be combined with the 
above feature so a generic callback framework was added ? 

I was thinking something like :
 register_callback(my_func, (void*) myprivatedata, 
 CB_ON_TRNSACT_OK|CB_ON_TRNSACT_ABORT );

Other flags could be on CB_ON_QUERY_DONE/ABORTED, CB_ON_CONNETION_CLOSED

I wouldn't mind trying to do patches if they would be accepted for 8.1 or 8.2 
but I would if we could settle for an API before I start coding anything.

---
Nicolai Petri




---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
  joining column's datatypes do not match


Re: [HACKERS] Hashagg planning bug (8.0.1)

2005-05-10 Thread Rod Taylor
On Tue, 2005-05-10 at 12:11 -0400, Tom Lane wrote:
> Rod Taylor <[EMAIL PROTECTED]> writes:
> > It would seem that the planner does not take into account whether the
> > datatypes involved have the capability to use hash aggregates or not.
> 
> > sdb=# explain select query_start, current_query from pg_locks join 
> > stat_activity on pid = procpid where granted = true and transaction in 
> > (select transaction from pg_locks where granted = false);
> > ERROR:  could not find hash function for hash operator 716373
> 
> What's stat_activity?  I thought you meant pg_stat_activity, but that
> works fine here.

Oh, stat_activity is a view which removes idle connections from
displaying and allows non-privileged users to see everything that's
going on within the DB.

CREATE OR REPLACE FUNCTION stat_activity()
  RETURNS setof pg_stat_activity
  SECURITY DEFINER
  AS 'select * from pg_stat_activity; '
  language sql;

CREATE OR REPLACE VIEW stat_activity AS
 SELECT stat_activity.procpid, stat_activity.usename,
stat_activity.query_start::timestamp(0) without time zone AS
query_start, stat_activity.current_query
   FROM stat_activity()
  WHERE stat_activity.current_query <> ''::text;

-- 


---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: [HACKERS] Hashagg planning bug (8.0.1)

2005-05-10 Thread Tom Lane
Rod Taylor <[EMAIL PROTECTED]> writes:
> Oh, stat_activity is a view which removes idle connections from
> displaying and allows non-privileged users to see everything that's
> going on within the DB.

Still works fine for me.  Do you even have an operator 716373?
If so what is it?

regards, tom lane

---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])


Re: [HACKERS] Views, views, views! (long)

2005-05-10 Thread Josh Berkus
Peter, Merlin, Andrew,

> > > And the restriction to current user owned objects reduces usability
> to
> > > zero.
> >
> > The information schema restricts the views to the objects to which you
> > have
> > some access right, which doesn't seem all that useless.

There's a difference between restricting it to objects on which you have 
rights (which our views do as well) vs. restricting it to objects you *own*, 
which is what the information_schema does, according to Andrew.  Yes?

More in next e-mail.

-- 
Josh Berkus
Aglio Database Solutions
San Francisco

---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])


Re: [HACKERS] Hashagg planning bug (8.0.1)

2005-05-10 Thread Rod Taylor
On Tue, 2005-05-10 at 12:50 -0400, Tom Lane wrote:
> Rod Taylor <[EMAIL PROTECTED]> writes:
> > Oh, stat_activity is a view which removes idle connections from
> > displaying and allows non-privileged users to see everything that's
> > going on within the DB.
> 
> Still works fine for me.  Do you even have an operator 716373?
> If so what is it?

It's the = operator that Slony adds for xxid comparisons. I didn't even
think of changes Slony would have made.

ssdb=# select * from pg_operator where oid = 716373;
 oprname | oprnamespace | oprowner | oprkind | oprcanhash | oprleft | oprright 
| oprresult | oprcom | oprnegate | oprlsortop | oprrsortop | oprltcmpop | 
oprgtcmpop |oprcode| oprrest |  oprjoin
-+--+--+-++-+--+---++---+++++---+-+---
 =   | 2200 |  588 | b   | t  |  716353 |   716353 
|16 | 716373 |716372 | 716371 | 716371 | 716371 | 
716369 | _ssrep.xxideq | eqsel   | eqjoinsel
(1 row)

-- 


---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
  joining column's datatypes do not match


Re: [HACKERS] Hashagg planning bug (8.0.1)

2005-05-10 Thread Tom Lane
Rod Taylor <[EMAIL PROTECTED]> writes:
> It's the = operator that Slony adds for xxid comparisons. I didn't even
> think of changes Slony would have made.

> ssdb=# select * from pg_operator where oid = 716373;
>  oprname | oprnamespace | oprowner | oprkind | oprcanhash | oprleft | 
> oprright | oprresult | oprcom | oprnegate | oprlsortop | oprrsortop | 
> oprltcmpop | oprgtcmpop |oprcode| oprrest |  oprjoin
> -+--+--+-++-+--+---++---+++++---+-+---
>  =   | 2200 |  588 | b   | t  |  716353 |   
> 716353 |16 | 716373 |716372 | 716371 | 716371 | 
> 716371 | 716369 | _ssrep.xxideq | eqsel   | eqjoinsel
> (1 row)

I think you need to have a word with the Slony boys.  They shouldn't be
marking the operator oprcanhash if they aren't providing a valid hash
opclass for the datatype.  Per the manual:

: To be marked HASHES, the join operator must appear in a hash index
: operator class. This is not enforced when you create the operator, since
: of course the referencing operator class couldn't exist yet. But
: attempts to use the operator in hash joins will fail at runtime if no
: such operator class exists. The system needs the operator class to find
: the data-type-specific hash function for the operator's input data
: type. Of course, you must also supply a suitable hash function before
: you can create the operator class.

regards, tom lane

---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


Re: [HACKERS] Views, views, views: Summary of Arguments

2005-05-10 Thread Josh Berkus
Folks,

We've meandered a bit on this, so I wanted to summarize the arguments 
presented on the new system views to date so that we might have some hope of 
consensus before feature freeze.   

As I see it, there are 3 main arguments about having the new system views at 
all.  These obviously need to be settled before we go any further on security 
models, column names, etc.   Please add if I've missed anyone's arguments, 
I'm trying to summarize across 2 weeks of discussion and am obviously not 
impartial.

Argument (1):  Are the views useful to users?
Pro: Several people, particularly the proposers, contend that they are.  They 
cite as evidence the popularity of related articles on General Bits, 
commercial precedent, and the prevalence of user-created system views.  
Mostly, the usefulness is aimed at new users.
Con: A few people say that they are not useful, and that the system tables are 
easily understood.

Argument (2): Do they provide sufficiently distinct functionality from the 
information_schema?
Pro:  The proposers contend that the information_schema, by SQL spec, is 
unable to show all PostgreSQL objects in sufficient detail.   That the 
permissions and uniqueness models are wrong for PostgreSQL, and these things 
are not easily fixed by extension without breaking the SQL spec.  That we 
don't want to confuse the information_schema with PostgreSQL-specific 
extensions.
Con: Several people, most notably Peter, contend that much of the new system 
views are duplicative of information_schema, and that efforts should be made 
to extend infomation_schema instead of providing a parallel interface.  That 
we should make serious efforts to support a standard rather than developing a 
proprietary interface.  A few people claimed that there was nothing that 
information_schema didn't have, or that users didn't need that information 
anyway.

Argument (3): Would the new system views be useful to interface designers?
Pro:  Christopher Kings-Lynne said yes for phpPgAdmin.  Josh argued that we 
need to look at interface designers who are designing for 3rd-party 
multi-database products who are not supporting PostgreSQL yet and will be 
unlikely to learn the system tables. 
Con:  Dave Page said no for pgAdmin.  Several people pointed out issues with 
the idea of maintaining backwards compatibility through abstraction.  Others 
cited argument (2) in favor of information_schema, above.

... thus, as I see it, the *primary* question is in fact argument (2).  That 
is, is information_schema sufficient, and if not, can it be extended without 
breaking SQL standards?   Argument (1) did not seem to have a lot of evidence 
on the "con" side, and the strongest argument against (3) is that we should 
use information_schema.

Andrew, can you do a more cohesive set of points on the 2nd half of that 
question?  That is, how much SQL spec would we have to break (other than 
extension) to cover all of the stuff that pg_sysviews currently covers?

-- 
Josh Berkus
Aglio Database Solutions
San Francisco

---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: [HACKERS] Views, views, views: Summary of Arguments

2005-05-10 Thread Joshua D. Drake
... thus, as I see it, the *primary* question is in fact argument (2).  That 
is, is information_schema sufficient, and if not, can it be extended without 
breaking SQL standards?   Argument (1) did not seem to have a lot of evidence 
on the "con" side, and the strongest argument against (3) is that we should 
use information_schema.
(2) The information_schema is good but not sufficient. It either needs 
more info as suggested by this thread or we need an extended version for 
Pg specifically.

(1) I can't see anyone in their right mind on the user space / support 
of users side arguing against the need for more information about 
PostgreSQL and the way it interacts.

(3) If we can use the information_schema let's do so. However it should 
not be a stopping block.

Sincerely,
Joshua D. Drake
Command Prompt. Inc.
--
Your PostgreSQL solutions company - Command Prompt, Inc. 1.800.492.2240
PostgreSQL Replication, Consulting, Custom Programming, 24x7 support
Managed Services, Shared and Dedication Hosting
Co-Authors: plPHP, plPerlNG - http://www.commandprompt.com/
---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [HACKERS] Views, views, views! (long)

2005-05-10 Thread Jim C. Nasby
On Tue, May 10, 2005 at 04:55:40PM +0200, Peter Eisentraut wrote:
> Am Montag, 9. Mai 2005 00:41 schrieb Andrew - Supernews:
> > >> c) In most places, "system" objects are segregated from
> > >> "user" objects,  e.g. pg_user_indexes
> > >
> > > I think that is a bad idea as it goes against the fundamental design of
> > > PostgreSQL.
> >
> > In what way? Please elaborate.
> 
> PostgreSQL does not really distinguish between "system" and "user" things.  
> How will you do that?

It's currently done using this function:

create or replace function _pg_sv_system_schema(name) returns boolean
  as 'select $1 in (name ''pg_catalog'', name ''pg_toast'',
name ''pg_sysviews'', name ''information_schema'')'
  language sql immutable strict;

Objects that are in one of those schemas are considered system objects.
This is how pg_dump does it (except for casts, which are considered
system objects if the source type, destination type, and conversion
function are ALL in system schemas).

psql also distinguishes between system and user tables, although it
restricts this to pg_catalog.
-- 
Jim C. Nasby, Database Consultant   [EMAIL PROTECTED] 
Give your computer some brain candy! www.distributed.net Team #1828

Windows: "Where do you want to go today?"
Linux: "Where do you want to go tomorrow?"
FreeBSD: "Are you guys coming, or what?"

---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


Re: [HACKERS] Views, views, views: Summary of Arguments

2005-05-10 Thread Dann Corbit
Suggestion:
Use INFORMATION_SCHEMA for everything that INFORMATION_SCHEMA covers.
That way, there will not be needless duplications.

Create new tables with foreign keys to the INFORMATION_SCHEMA for
everything else.

Alternative suggestion:
Create any sort of magic, pg-specific schema you want, and create views
that map the stuff back to the INFORMATION_SCHEMA to fill
INFORMATION_SCHEMA out completely.

Both methods are equally good to me.

What would be painful (in my view) is if the new "custom" schema has
INFORMATION_SCHEMA data in it, and the INFORMATION_SCHEMA does not
contain that needed information (IOW: INFORMATION_SCHEMA lags behind
because the PG specific schema gets lots of work and the
INFORMATION_SCHEMA gets secondary attention).

As long as I get my INFORMATION_SCHEMA views, and as long as they are
fully populated, I would not care at all if there were additional
information somewhere else.

> -Original Message-
> From: [EMAIL PROTECTED] [mailto:pgsql-hackers-
> [EMAIL PROTECTED] On Behalf Of Joshua D. Drake
> Sent: Tuesday, May 10, 2005 10:30 AM
> To: Josh Berkus
> Cc: pgsql-hackers@postgresql.org
> Subject: Re: [HACKERS] Views, views, views: Summary of Arguments
> 
> >
> > ... thus, as I see it, the *primary* question is in fact argument
(2).
> That
> > is, is information_schema sufficient, and if not, can it be extended
> without
> > breaking SQL standards?   Argument (1) did not seem to have a lot of
> evidence
> > on the "con" side, and the strongest argument against (3) is that we
> should
> > use information_schema.
> 
> (2) The information_schema is good but not sufficient. It either needs
> more info as suggested by this thread or we need an extended version
for
> Pg specifically.
> 
> (1) I can't see anyone in their right mind on the user space / support
> of users side arguing against the need for more information about
> PostgreSQL and the way it interacts.
> 
> (3) If we can use the information_schema let's do so. However it
should
> not be a stopping block.
> 
> Sincerely,
> 
> Joshua D. Drake
> Command Prompt. Inc.
> 
> 
> --
> Your PostgreSQL solutions company - Command Prompt, Inc.
1.800.492.2240
> PostgreSQL Replication, Consulting, Custom Programming, 24x7 support
> Managed Services, Shared and Dedication Hosting
> Co-Authors: plPHP, plPerlNG - http://www.commandprompt.com/
> 
> ---(end of
broadcast)---
> TIP 4: Don't 'kill -9' the postmaster

---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] Oracle Style packages on postgres

2005-05-10 Thread Jim C. Nasby
On Tue, May 10, 2005 at 12:01:54PM +0300, Adrian Maier wrote:
> > Personally, I think the biggest win here would be adding package support
> > and syntax to plpgsql. Not only would it make porting from Oracle
> > easier, it would also make plpgsql much, much more powerful.
> 
> Hello,
> 
> What do you think about having some kind of language-independent
> packages ?
> I'm thinking that it could be handy to implement some functions in
> plpgsql, some functions in plpython and so . And then bundle them
> together into the same package.

Personally, I basically only use plpgsql, but I can certainly see where
there would be value in being able to include functions and procedures
from multiple languages in one package. But I suspect this will also
make some things more difficult, such as global static variables.
-- 
Jim C. Nasby, Database Consultant   [EMAIL PROTECTED] 
Give your computer some brain candy! www.distributed.net Team #1828

Windows: "Where do you want to go today?"
Linux: "Where do you want to go tomorrow?"
FreeBSD: "Are you guys coming, or what?"

---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] Oracle Style packages on postgres

2005-05-10 Thread Jim C. Nasby
On Mon, May 09, 2005 at 11:24:45PM +0200, Thomas Hallgren wrote:
> In Oracle you can use the syntax:
> 
> ..()
> 
> but you can just as well use the syntax:
> 
> ..()
> 
> Why do you need both? If PostgreSQL is going to add new nice features 
> that enables better namespace handling and global variables, take a look 
> at Oracles UDT's with static and instance methods. Only thing that I'm 
> not sure is there is static variables. If it's missing, we could add 
> that easilly and give them the same life-span as the session.

It's been a while since I used types, but here's some issues I can think
of:

I don't believe types allow for internal-only methods. I seem to recall
other limitations on what types could do as opposed to packages. Of
course, we need not restrict ourselves in such a manner.

Types are not used nearly as much as packages (this is an issue if we
care about enabling Oracle users to migrate).

Types generally force you to use them in relation to some database
object. Packages have no such restriction.

Don't get me wrong, I think supporting more powerful types would be a
welcome addition, but I don't think they can be as flexable as packages.
The good news is that they should both be able to use the same
underlying framework. Types are afterall just a specialized
implementation of packages.
-- 
Jim C. Nasby, Database Consultant   [EMAIL PROTECTED] 
Give your computer some brain candy! www.distributed.net Team #1828

Windows: "Where do you want to go today?"
Linux: "Where do you want to go tomorrow?"
FreeBSD: "Are you guys coming, or what?"

---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] Views, views, views! (long)

2005-05-10 Thread Andrew - Supernews
On 2005-05-10, Peter Eisentraut <[EMAIL PROTECTED]> wrote:
> Am Freitag, 6. Mai 2005 12:20 schrieb Andreas Pflug:
>> > and the information_schema is next to useless for these things since
>> >  it doesn't have PostgreSQL specific things in it.
>>
>> And the restriction to current user owned objects reduces usability to
>> zero.
>
> The information schema restricts the views to the objects to which you have 
> some access right, which doesn't seem all that useless.

Actually that varies according to the individual view. Some restrict to
objects to which you have access, others restrict to objects that you own.
Furthermore, in the latter case there is no exception for superusers.

-- 
Andrew, Supernews
http://www.supernews.com - individual and corporate NNTP services

---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])


Re: [HACKERS] Oracle Style packages on postgres

2005-05-10 Thread Jim C. Nasby
On Mon, May 09, 2005 at 09:56:53PM -0400, Bruce Momjian wrote:
> OK, so it seems we need:
> 
>   C static/private functions for schemas
>   C static/private variables for schemas
> 
> Are private variables implemented via the temporary per-session schema?
> 
>   nested schemas
> 
> What does the standard say?
> 
> Is that it?

I think a big part of the usefulness of packages is in supplying an
additional level of grouping common things together. Of course, nested
schemas with public/private functions (and procedures, lest we forget
them) is a much better way to do this, since a schema can encompass
everything you'd need; tables, views, types, etc.

Having said that, I would say that private variables need to be exposed
via the same nested schema interface as everything else. If the
implementation under the covers is via the temporary schema, that's
fine.

As for using temporary tables as session storage, that has a huge
performance penalty associated with it. Part of the advantage to package
variables is that you can use them to cache information your code will
need to access frequently. That access then becomes a simple variable or
array read, which is obviously much faster than parsing a query to hit a
temp table.

There is one feature not mentioned by Bruce's design, and that's
initialization (and teardown) code. I don't recall using that capability
in Oracle, but I was wondering if others with more experience could
comment on it.
-- 
Jim C. Nasby, Database Consultant   [EMAIL PROTECTED] 
Give your computer some brain candy! www.distributed.net Team #1828

Windows: "Where do you want to go today?"
Linux: "Where do you want to go tomorrow?"
FreeBSD: "Are you guys coming, or what?"

---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] Please clarify

2005-05-10 Thread Heikki Linnakangas
On Tue, 10 May 2005 [EMAIL PROTECTED] wrote:
I need to check for the existence of a user defined view named 'audit_vw'
and if exists, then i need to delete the same. Please help me to solve the
issue.
If you don't need to do anything else in the transaction, you could just 
issue "DROP VIEW audit_vw" and see if it succeeds or fails.

If you do need to do something else in the same transaction, you can use 
savepoints:

BEGIN;
SAVEPOINT sp;
DROP VIEW audit_vw;
if it fails: ROLLBACK TO sp;
...
COMMIT;
Ps. This kind of questions are usually discussed on the pgsql-general 
mailing list.

- Heikki
---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
 joining column's datatypes do not match


FW: [HACKERS] Views, views, views! (long)

2005-05-10 Thread Merlin Moncure
my bad [thanks Greg]
> Peter, Merlin, Andrew,
> 
> > > > And the restriction to current user owned objects reduces
usability
> > to
> > > > zero.
> > >
> > > The information schema restricts the views to the objects to which
you
> > > have
> > > some access right, which doesn't seem all that useless.
> 
> There's a difference between restricting it to objects on which you
have
> rights (which our views do as well) vs. restricting it to objects you
> *own*,
> which is what the information_schema does, according to Andrew.  Yes?

Good point, although I'll take either over the current behavior.
Basically, I feel very strongly that *all* interface to database
metadata should be through views unless you happen to database root.  I
have personal reasons for this but I also think it's the correct
philosophy in general, even if the views are the incredibly simple:

create view pg_class_view as select * from pg_class;

As a side note I also like the idea of set returning functions because
of the extra flexibility of security invoker/definer...but I'll be a
huge cheerleader for any effort that gets user applications off of
dependency on internal system tables.

In a previous similar discussion Simon noted that the Terradata database
had different view configurations for different security requirements.

my 0.02$ (and that's about all it's worth :-) )

Merlin
 


---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


Re: [HACKERS] Table Partitioning, Part 1

2005-05-10 Thread Jim C. Nasby
On Tue, May 10, 2005 at 12:16:17AM +0100, Simon Riggs wrote:
> On Mon, 2005-05-09 at 18:53 -0400, Tom Lane wrote:
> > Simon Riggs <[EMAIL PROTECTED]> writes:
> > > 1. Embellish inheritance or separate infrastructure?
> > 
> > > It seems prudent to avoid building on that foundation, even though we
> > > may decide to use some similar approaches.
> > 
> > I disagree.  The code is there, it could use work, and what you are
> > basically proposing is to duplicate both the existing work and much
> > of the improvement it needs.
> 
> Minefields need clearing someday, I suppose. 
> 
> Multiple inheritance isn't something I'll be spending time on though.

I'm also not sure that inheritance would support all cases. For example,
in some situations PPUC3 leads to doing individual value partitioning,
where a partition is guaranteed to have only one value for part of the
PPK, meaning that there's no reason to store that part of the key in the
partition itself. Currently this is possible with partitions built out
of views but not out of inherited tables.
-- 
Jim C. Nasby, Database Consultant   [EMAIL PROTECTED] 
Give your computer some brain candy! www.distributed.net Team #1828

Windows: "Where do you want to go today?"
Linux: "Where do you want to go tomorrow?"
FreeBSD: "Are you guys coming, or what?"

---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: [HACKERS] Views, views, views: Summary of Arguments

2005-05-10 Thread David Fetter
On Tue, May 10, 2005 at 10:21:06AM -0700, Josh Berkus wrote:
> Folks,
> 
> We've meandered a bit on this, so I wanted to summarize the
> arguments presented on the new system views to date so that we might
> have some hope of consensus before feature freeze.   
> 
> As I see it, there are 3 main arguments about having the new system
> views at all.  These obviously need to be settled before we go any
> further on security models, column names, etc.   Please add if I've
> missed anyone's arguments, I'm trying to summarize across 2 weeks of
> discussion and am obviously not impartial.
> 
> Argument (1):  Are the views useful to users?

I've used some of them already at work :)

> Pro: Several people, particularly the proposers, contend that they
> are.  They cite as evidence the popularity of related articles on
> General Bits, commercial precedent, and the prevalence of
> user-created system views.  Mostly, the usefulness is aimed at new
> users.
> Con: A few people say that they are not useful, and that the system
> tables are easily understood.

Anybody who contends that the system tables are easily understood is
more than welcome to use this understanding, and will not be impeded
by the existence of things they don't choose to use.  The
aforementioned understanding--quite rare, but that's almost beside the
point--is not an argument for keeping tools out of the hands of people
for whom the internals of the PostgreSQL implementation are not
intuitively obvious.

> Argument (2): Do they provide sufficiently distinct functionality
> from the information_schema?
> Pro:  The proposers contend that the information_schema, by SQL
> spec, is unable to show all PostgreSQL objects in sufficient detail.
> That the permissions and uniqueness models are wrong for PostgreSQL,
> and these things are not easily fixed by extension without breaking
> the SQL spec.  That we don't want to confuse the information_schema
> with PostgreSQL-specific extensions.
> Con: Several people, most notably Peter, contend that much of the
> new system views are duplicative of information_schema, and that
> efforts should be made to extend infomation_schema instead of
> providing a parallel interface.  That we should make serious efforts
> to support a standard rather than developing a proprietary
> interface.  A few people claimed that there was nothing that
> information_schema didn't have, or that users didn't need that
> information anyway.

> Argument (3): Would the new system views be useful to interface designers?
> Pro:  Christopher Kings-Lynne said yes for phpPgAdmin.  Josh argued that we 
> need to look at interface designers who are designing for 3rd-party 
> multi-database products who are not supporting PostgreSQL yet and will be 
> unlikely to learn the system tables. 
> Con:  Dave Page said no for pgAdmin.  Several people pointed out issues with 
> the idea of maintaining backwards compatibility through abstraction.  Others 
> cited argument (2) in favor of information_schema, above.

> ... thus, as I see it, the *primary* question is in fact argument
> (2).  That is, is information_schema sufficient,

Not by a long, long way.

> and if not, can it be extended without breaking SQL standards?

The information schema, by its nature, cannot contain information
about things like indexes because that would be implementation-
specific information, and the information schema is, by design,
implementation-neutral.

Just my $.02 :)

Cheers,
D
-- 
David Fetter [EMAIL PROTECTED] http://fetter.org/
phone: +1 510 893 6100   mobile: +1 415 235 3778

Remember to vote!

---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


Re: [HACKERS] Oracle Style packages on postgres

2005-05-10 Thread Thomas Hallgren
Jim C. Nasby wrote:
I don't believe types allow for internal-only methods. I seem to recall
other limitations on what types could do as opposed to packages. Of
course, we need not restrict ourselves in such a manner.
 

Do Oracle packages support internal only functions? If they do, then I 
agree, that's a feature that the SQL standard doesn't have.

Types are not used nearly as much as packages (this is an issue if we
care about enabling Oracle users to migrate).
Types generally force you to use them in relation to some database
object. Packages have no such restriction.
 

If used as a package, i.e. only containing static methods, you don't 
need to use the type in relation to anything. It's simply a namespace. 
If used with a temporary table, you get a very neat, standardized, 
cross-language way of managing session data.

Don't get me wrong, I think supporting more powerful types would be a
welcome addition, but I don't think they can be as flexable as packages.
 

I see this differently. A full implementation of the SQL-standard for 
UDT's will allow abstract types, inheritance, method overloading, etc. 
It quickly becomes far more flexible then Oracle packages. A full 
implementation is of course beyond the scope for what's needed to 
accommodate the needs of those who use packages but a simple 
implementation is extendable within the scope of the standard.

The good news is that they should both be able to use the same
underlying framework. Types are afterall just a specialized
implementation of packages.
 

Right. Given a good implementation of types, packages would be easy to 
implement. The other way around would not be possible. A package is a 
very restricted type that contains static methods only. Possibly with 
the extension of some kind of method/attribute visibility.

So do we need "internal only" functions although they are not covered by 
the SQL-standard? If the answer is no, then IMO we should follow the 
standard and use types, not packages. If the answer is yes, then the  
SQL-standard is not enough. Should we then use packages or simply 
introduce the keyword PRIVATE on methods of a type? Personally, I'd go 
for the latter and then, if necessary, build packages on top of that in 
for the benefit of Oracle users who wants to migrate. A fully fledged 
type system will ease Oracle migration too since Oracle already has this.

Regards,
Thomas Hallgren

---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?
  http://www.postgresql.org/docs/faq


Re: [HACKERS] Oracle Style packages on postgres

2005-05-10 Thread Jim C. Nasby
On Tue, May 10, 2005 at 08:40:16PM +0200, Thomas Hallgren wrote:
> Jim C. Nasby wrote:
> 
> >I don't believe types allow for internal-only methods. I seem to recall
> >other limitations on what types could do as opposed to packages. Of
> >course, we need not restrict ourselves in such a manner.
> > 
> >
> Do Oracle packages support internal only functions? If they do, then I 
> agree, that's a feature that the SQL standard doesn't have.

AFAIK anything in an Oracle packages can be internal-only. You just
exclude it from the header.

> I see this differently. A full implementation of the SQL-standard for 
> UDT's will allow abstract types, inheritance, method overloading, etc. 
> It quickly becomes far more flexible then Oracle packages. A full 
> implementation is of course beyond the scope for what's needed to 
> accommodate the needs of those who use packages but a simple 
> implementation is extendable within the scope of the standard.

Packages allow for abstract types and method overloading. They don't
allow for inheritance.

> >The good news is that they should both be able to use the same
> >underlying framework. Types are afterall just a specialized
> >implementation of packages.
> > 
> >
> Right. Given a good implementation of types, packages would be easy to 
> implement. The other way around would not be possible. A package is a 
> very restricted type that contains static methods only. Possibly with 
> the extension of some kind of method/attribute visibility.

I guess maybe I'm not clear on what you mean by static methods. IIRC, in
Oracle nomenclature, static means it will retain state between
invocations in the same session. Of course, functions and procedures
that don't do this are also allowed.

Basically, before we assume that one implementation allows for the other
I think some research needs to be done. Hopefully someone on the list is
familiar with both. I think it would be a huge win if we could offer a
compatability mechanism that makes it easy for Oracle packages to be
used in PostgreSQL, making migration from Oracle much, much easier.

> So do we need "internal only" functions although they are not covered by 
> the SQL-standard? If the answer is no, then IMO we should follow the 
> standard and use types, not packages. If the answer is yes, then the  
> SQL-standard is not enough. Should we then use packages or simply 
> introduce the keyword PRIVATE on methods of a type? Personally, I'd go 
> for the latter and then, if necessary, build packages on top of that in 
> for the benefit of Oracle users who wants to migrate. A fully fledged 
> type system will ease Oracle migration too since Oracle already has this.

I think both should allow for private functions/procedures/methods. BTW,
I'm also very keen on the idea of nested schemas, which is another
possible means to the package ends.
-- 
Jim C. Nasby, Database Consultant   [EMAIL PROTECTED] 
Give your computer some brain candy! www.distributed.net Team #1828

Windows: "Where do you want to go today?"
Linux: "Where do you want to go tomorrow?"
FreeBSD: "Are you guys coming, or what?"

---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [HACKERS] Table Partitioning, Part 1

2005-05-10 Thread Tom Lane
"Jim C. Nasby" <[EMAIL PROTECTED]> writes:
> On Tue, May 10, 2005 at 12:16:17AM +0100, Simon Riggs wrote:
>> On Mon, 2005-05-09 at 18:53 -0400, Tom Lane wrote:
>>> I disagree.  The code is there, it could use work, and what you are
>>> basically proposing is to duplicate both the existing work and much
>>> of the improvement it needs.
>> 
>> Minefields need clearing someday, I suppose. 
>> 
>> Multiple inheritance isn't something I'll be spending time on though.

> I'm also not sure that inheritance would support all cases.

My point seems to have been widely misunderstood ;-)

I was not suggesting that partitioning must be built on top of
inheritance, nor vice versa, nor that they need to support exactly
the same feature sets.  What I am saying is that if you adopt an
NIH attitude to the existing code, you are going to end up with a
lot of duplication.  There is a substantial amount of potentially
common infrastructure, as well as common problems that you might as
well solve for both cases at once.  (Remember the inventor's paradox:
the more general problem is often easier to solve.)  In particular,
the planning problems look essentially the same to me, as does the
indexing problem.

regards, tom lane

---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: [HACKERS] Oracle Style packages on postgres

2005-05-10 Thread Thomas Hallgren
Jim C. Nasby wrote:
I guess maybe I'm not clear on what you mean by static methods. IIRC, in
Oracle nomenclature, static means it will retain state between
invocations in the same session. Of course, functions and procedures
that don't do this are also allowed.
 

A STATIC prefix on a method simply means that it is not tied to a 
particular instance of the type where it is defined.

You have the type Foo with the method bar(). If the method is STATIC, 
you can use:

SELECT Foo.bar();
If it's an INSTANCE method, you can only call it when you have an 
instance available, so if FooTable is a table described by the type Foo 
and bar is non-static, you could write:

SELECT x.bar() FROM FooTable x;
I think both should allow for private functions/procedures/methods. BTW,
I'm also very keen on the idea of nested schemas, which is another
possible means to the package ends.
 

I'd like that too although I don't think it's included in the SQL-standard.
Regards,
Thomas Hallgren

---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


Re: [HACKERS] Views, views, views! (long)

2005-05-10 Thread Tom Lane
"Jim C. Nasby" <[EMAIL PROTECTED]> writes:
> On Tue, May 10, 2005 at 04:55:40PM +0200, Peter Eisentraut wrote:
>> PostgreSQL does not really distinguish between "system" and "user" things.  
>> How will you do that?

> It's currently done using this function:

> create or replace function _pg_sv_system_schema(name) returns boolean
>   as 'select $1 in (name ''pg_catalog'', name ''pg_toast'',
> name ''pg_sysviews'', name ''information_schema'')'
>   language sql immutable strict;

> Objects that are in one of those schemas are considered system objects.
> This is how pg_dump does it

Peter's point still stands though: the *system* isn't making that
distinction.  pg_dump needs to make a distinction so that it doesn't
dump built-in objects; which is not necessarily the same distinction
that a user might want to make.  Thus, the fact that psql does it a
bit differently isn't necessarily a bug.

I think the real problem here is that it's hard to be all things to all
people.  If you suppress display of certain objects, that may be nice
suppression of clutter for one user, yet render the view useless from
the perspective of another user --- or even the same user on a different
day, when he is looking for a particular built-in function for instance.
(I know it's always bugged the heck out of me that \df editorializes on
which functions it thinks I want to see.)

regards, tom lane

---(end of broadcast)---
TIP 8: explain analyze is your friend


Re: [HACKERS] Oracle Style packages on postgres

2005-05-10 Thread Bruce Momjian
Dave Held wrote:
> > -Original Message-
> > From: Bruce Momjian [mailto:[EMAIL PROTECTED]
> > Sent: Tuesday, May 10, 2005 8:43 AM
> > To: Thomas Hallgren
> > Cc: Tom Lane; [EMAIL PROTECTED]; pgsql-hackers@postgresql.org
> > Subject: Re: [HACKERS] Oracle Style packages on postgres
> > 
> > [...]
> > I suppose.  I think we should focus on the use cases for Oracle
> > packages, rather than the specific functionality it provides. 
> > What things do people need PostgreSQL to do that it already
> > doesn't do?
> 
> Is that really the best way to go about things?  Already RDBMSes
> are patchwork quilts of functionality.  Is merely adding another
> patch the most elegant way to evolve the database?  The problem is
> that Oracle et al are trying to be ORDBMSes and aren't exactly sure
> what the best way to go is.  Instead of trying to formulate a 
> rational plan for what an ORDBMS should even look like, they simply
> look at what would work with their existing infrastructure and tack
> on features.  Then Postgres plays the copycat game.  Instead of
> trying to play catch-up with Oracle, why not beat them at their own
> game?

I was unclear. I was suggesting exactly what you posted, that we look at
what functionality we _need_ from Oracle packages, rather than the
functionality of Oracle packages themselves.  My assumption is that
Oracle does some things we need, and some things we don't, and does them
in some ways we will like, and others we will not, so let's look at the
actuall use cases that we need to address.

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  pgman@candle.pha.pa.us   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073

---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] Views, views, views! (long)

2005-05-10 Thread Josh Berkus
Tom,

> I think the real problem here is that it's hard to be all things to all
> people.  If you suppress display of certain objects, that may be nice
> suppression of clutter for one user, yet render the view useless from
> the perspective of another user --- or even the same user on a different
> day, when he is looking for a particular built-in function for instance.
> (I know it's always bugged the heck out of me that \df editorializes on
> which functions it thinks I want to see.)

But all of the views we've composed as pg_user_* also have a pg_all_*.  So 
users can do what they want.

-- 
--Josh

Josh Berkus
Aglio Database Solutions
San Francisco

---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] Can we get patents?

2005-05-10 Thread David Walker
> That depends; is the SFLC offering to pay for the patent applications?  Last 
> I 
> checked, it was somewhere around $6000 per patent.

Nolo press (www.nolo.com) sells a book on patents. Many people file their own 
patent applications successfully. The cost is less that $1000.

David



---(end of broadcast)---
TIP 8: explain analyze is your friend


Re: [HACKERS] [PATCHES] Cleaning up unreferenced table files

2005-05-10 Thread Heikki Linnakangas
On Sun, 8 May 2005, Tom Lane wrote:
While your original patch is buggy, it's at least fixable and has
localized, limited impact.  I don't think these schemes are safe
at all --- they put a great deal more weight on the semantics of
the filesystem than I care to do.
I'm going to try this some more, because I feel that a scheme like this 
that doesn't rely on scanning pg_class and the file system would in fact 
be safer.

The key is to A) obey the "WAL first" rule, and A) remember information 
about file creations over a checkpoint. The problem with the my previous 
suggestion was that it didn't reliably accomplish either :).

Right now we break the WAL rule because the file creation is recorded 
after the file is created. And the record is not flushed.

The trivial way to fix that is to write and flush the xlog record before 
actually creating the file. (for a more optimized way to do it, see end of 
message). Then we could trust that there aren't any files in the data 
directory that don't have a corresponding record in WAL.

But that's not enough. If a checkpoint occurs after the file is 
created, but before the transaction ends, WAL replay doesn't see the file 
creation record. That's why we need a mechanism to carry the information 
over the checkpoint.

We could do that by extending the ForwardFsyncRequest function or by
creating something similar to that. When a backend writes the file 
creation WAL record, it also sends a message to the bgwriter that says 
"I'm xid 1234, and I have just created file foobar/1234" (while holding 
CheckpointStartLock). Bgwriter keeps a list of xid/file pairs like it 
keeps a list of pending fsync operations. On checkpoint, the checkpointer 
scans the list and removes entries for transactions that have already 
ended, and attaches the remaining list to the checkpoint record.

WAL replay would start with the xid/file list in the checkpoint record, 
and update it during the replay whenever a file creation or a transaction 
commit/rollback record is seen. On a rollback record, files created by 
that transaction are deleted. At the end of WAL replay, the files that are 
left in the list belong to transactions that implicitly aborted, and can 
be deleted.

If we don't want to extend the checkpoint record, a separate WAL record 
works too.

Now, the more optimized way to do A:
Delay the actual file creation until it's first written to. The write 
needs to be WAL logged anyway, so we would just piggyback on that.

Implemented this way, I don't think there would be a significant 
performance hit from the scheme. We would create more ForwardFsyncRequest 
traffic, but not much compared to the block fsync requests we have right 
now.

BTW: If we allowed mdopen to create the file if it doesn't exist already, 
would we need the current file creation xlog record for anything? (I'm 
not suggesting to do that, just trying to get more insight)

- Heikki
---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [HACKERS] [PATCHES] Cleaning up unreferenced table files

2005-05-10 Thread Bruce Momjian
Heikki Linnakangas wrote:
> On Sun, 8 May 2005, Tom Lane wrote:
> 
> > While your original patch is buggy, it's at least fixable and has
> > localized, limited impact.  I don't think these schemes are safe
> > at all --- they put a great deal more weight on the semantics of
> > the filesystem than I care to do.
> 
> I'm going to try this some more, because I feel that a scheme like this 
> that doesn't rely on scanning pg_class and the file system would in fact 
> be safer.

The current code is nice and localized and doesn't add any burden on our
existing code, which is already complicated enough.  I think we either
fix checkfiles.c, or we remove it and decide it isn't worth checking for
unrefrenced files.

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  pgman@candle.pha.pa.us   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073

---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: [HACKERS] Can we get patents?

2005-05-10 Thread Dave Cramer
I've talked to a friend of mine who is a patent lawyer.
1) in Europe if it is in the public domain then it cannot be patented
2) in North America you would have to patent before submitting to the 
project.
3) His question was why? With a bsd license you can't stop anyone from 
using it and nobody
else can patent it since by placing it in the project you are 
establishing prior art.

Dave
David Walker wrote:
That depends; is the SFLC offering to pay for the patent applications?  Last 
I 
checked, it was somewhere around $6000 per patent.
   

Nolo press (www.nolo.com) sells a book on patents. Many people file their own 
patent applications successfully. The cost is less that $1000.

David

---(end of broadcast)---
TIP 8: explain analyze is your friend
 

--
Dave Cramer
http://www.postgresintl.com
519 939 0336
ICQ#14675561
---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
 subscribe-nomail command to [EMAIL PROTECTED] so that your
 message can get through to the mailing list cleanly


Re: [HACKERS] [PATCHES] Cleaning up unreferenced table files

2005-05-10 Thread Tom Lane
Heikki Linnakangas <[EMAIL PROTECTED]> writes:
> On Sun, 8 May 2005, Tom Lane wrote:
>> While your original patch is buggy, it's at least fixable and has
>> localized, limited impact.  I don't think these schemes are safe
>> at all --- they put a great deal more weight on the semantics of
>> the filesystem than I care to do.

> I'm going to try this some more, because I feel that a scheme like this 
> that doesn't rely on scanning pg_class and the file system would in fact 
> be safer.

I think this proposal is getting more and more invasive and expensive,
and it's all to solve a problem that we do not even know is worth
spending any time on.  I *really* think this is the wrong direction
to take.  Aside from the required effort and risk of breaking things,
the original patch incurred cost only during crash recovery; this is
pushing costs into the normal code paths.

> Delay the actual file creation until it's first written to. The write 
> needs to be WAL logged anyway, so we would just piggyback on that.

This is a bad idea since by then it's (potentially) too late to roll
back the creating transaction if the creation fails.  Consider for
instance a tablespace directory that's mispermissioned read-only, or
some such.  I'd rather have the CREATE TABLE fail than a later INSERT.
(Admittedly, we can't completely guarantee that an INSERT won't hit
some kind of filesystem-level problem, but it's still something to
try to avoid.)

Also, the "first write" actually comes from mdextend, which is not a
WAL-logged operation AFAIR.  Some rethinking of that would be necessary
before this would have any chance of working.

regards, tom lane

---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] [PATCHES] Cleaning up unreferenced table files

2005-05-10 Thread Heikki Linnakangas
On Tue, 10 May 2005, Bruce Momjian wrote:
The current code is nice and localized and doesn't add any burden on our
existing code, which is already complicated enough.  I think we either
fix checkfiles.c, or we remove it and decide it isn't worth checking for
unrefrenced files.
Let's pull the patch for now.
I still think we should check for unreferenced files, but it needs some 
more thought and it's by no means urgent.

If I have time later on, I might write a patch to implement the WAL-based 
scheme to see how much change to existing code it really would need.

- Heikki
---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


Re: [HACKERS] Cost of XLogInsert CRC calculations

2005-05-10 Thread Simon Riggs
On Tue, 2005-05-10 at 10:34 -0400, Bruce Momjian wrote:
> Tom Lane wrote:
> > "Mark Cave-Ayland" <[EMAIL PROTECTED]> writes:
> > > I was just researching some articles on compression (zlib) and I saw 
> > > mention
> > > of the Adler-32 algorithm which is supposed to be slightly less accurate
> > > than an equivalent CRC calculation but significantly faster to compute. I
> > > haven't located a good paper comparing the error rates of the two 
> > > different
> > > checksums,
> > 
> > ... probably because there isn't one.  With all due respect to the Zip
> > guys, I doubt anyone has done anywhere near the analysis on Adler-32
> > that has been done on CRCs.  I'd much prefer to stick with true CRC
> > and drop it to 32 bits than go with a less-tested algorithm.  Throwing
> > more bits at the problem doesn't necessarily create a safer checksum.
> 
> Agreed.  64-bit was overkill when we added it, and it is now shown to be
> a performance problem.

Hold on... Tom has shown that there is a performance problem with the
existing CRC calculation. Thanks to Mark for staying on top of that with
some good ideas. 

The cause of the performance problem has been attributed to it being a
64-bit rather than 32-bit calculation. That is certainly part of it, but
I have seen evidence that there is an Intel processor stall associated
with the use of a single byte constant somewhere in the algorithm. So
I'm unclear as to what extent the poor performance is attributable to
either issue.

That's where my experience stops so I have highlighted that for somebody
with more hardware specific assembler experience to have a look at the
algorithm. Fixing that, if possible, could greatly improve the
performance whether or not we drop from 64 to 32 bits. My hope for
outside assistance on that looks like it is not now forthcoming.

My guess would be that the algorithm's use of the byte-by-byte
calculation together with a bitmask of &FF is responsible. Perhaps
varying the length of the bitmask to either &00FF or longer may
help? (see src/include/xlog_internal.h)

Best Regards, Simon Riggs


---(end of broadcast)---
TIP 8: explain analyze is your friend


Re: [HACKERS] Table Partitioning, Part 1

2005-05-10 Thread Simon Riggs
On Tue, 2005-05-10 at 16:31 +0300, Hannu Krosing wrote:
> On E, 2005-05-09 at 23:30 +0100, Simon Riggs wrote:
> > ISTM fairly straightforward to produce a similar "static" plan along the
> > same lines, using Result nodes to implement Partition Elimination.
> > 
> > Append
> >   Result
> > SeqScan
> >   Result
> > SeqScan
> >   Result
> > SeqScan
> 
> So you mean another node that looks inside seqscans restrictions and
> then determines if there is any chance that seqscan will return any
> rows ? 
> 
> Why can't this be done at planning time ? And if it can't be done at
> planning time, then how do you determine which plan is cheapest ?

It can if you have constants. If you have you have bind variables or
Stable functions then it can only be done for certain at execution time,
though can be estimated prior at planning time. This is exactly what is
done currently in the planner for constant eval and simplification.

> > If all partitions in the query had identical indexes on them, then we
> > have another option. In that case, each index could be thought to form
> > part of a larger index ordered initially on the Partitioning Key (PPK).
> > If the first column was actually the PPK, then the set of indexes would
> > be exactly equivalent to a Global Index. We can call this a Pseudo
> > Global Index.
> > 
> > The Pseudo Global Index could also be used to enforce uniqueness. If all
> > of the composite indexes were defined unique and the index contained the
> > PPK as one of its columns, this would work. 
> >
> > The index enforces
> > uniqueness within each partition and the PPK enforces uniqueness across
> > partitions because the same PPK value cannot be in two partitions.
> 
> But only uniqueness of PPK, not any other columns.

No, it would work for *any* set of columns that included the PPK.

> > We can also use the Pseudo Global Index as the object of a multi-index
> > scan node, since the PGI would be ordered on the PPK. The plan node
> > would use the partition boundary definitions to locate the starting
> > partition, then seek within the partition to the correct part of the
> > index then scan sideways, skipping to the next partition as required.
> > 
> > This sounds a great idea, but it is exactly the technique I argued
> > against only 4 paragraphs ago for use with SeqScans.
> 
> Also this works only for PPK and not other columns. 

This would work for any set of columns, as long as the PPK were the
leftmost column in the set.

> > My current thinking is that we need a mix of the two approaches, though
> > I raise all of this here to gain everybody's input...
> 
> I'm afraid that hiding the decisions inside the access methods will make
> things harder for the optimiser .

Me too.

> Still there may be cases where smarter access methods make sense as an
> additional feture, though I cant come up with an example right now.

Look at PPUC 2 Join partition elimination, which is the classic Fact to
TimeDimension join.


Hannu,

It's time to follow your thinking for the first implementation:
just implement Constraints as possible gating tests on each inherited
relation. No syntax changes, just some light but complex internals.

Once that works, we can discuss improving it to solve other required use
cases. Until we're at that stage, we should leave further discussion to
one side, now that we have the basic architecture agreed: inheritance.

Best Regards, Simon Riggs


---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


Re: [HACKERS] Table Partitioning, Part 1

2005-05-10 Thread Simon Riggs
On Tue, 2005-05-10 at 16:44 +0300, Hannu Krosing wrote:
> On T, 2005-05-10 at 16:31 +0300, Hannu Krosing wrote:
> > On E, 2005-05-09 at 23:30 +0100, Simon Riggs wrote:
> 
> > There are 2 possibly expensive steps: 
> > 
> > 1. the conversion to "AND'ed list of simple clauses" (unknown
> > complexity) 
> > 
> > 2. matching each of "simple" clauses in the and list with all others
> > (should be N+(N-1)+(N-2)+..+(1) ~= 2N) complexity)
> 
> actually not 2N but (N * ((N-1)/2) , thus 3 clauses need 2+1=3 checks
> and 11 clasues need (10+9+..+1) = 55 checks.

Well, it doesn't need to be quite that bad.

We can just check each of the table constraints against each of the
Restrict clauses. 

We don't need to test the constraints against themselves, since the
table would be empty if any ever turned out to be mutually exclusive, so
we wouldn't be saving time doing it. Later, we can enforce constraints
to be mutually exclusive at the time they are created.

We don't need to test the Restrict clauses against themselves either,
since we can assume that the user is smart enough to submit SQL that
returns some rows. Eval will pick up the stupid case if there is one.

Anyway, the existing code has an even simpler heuristic for avoiding the
full check...

Best Regards, Simon Riggs



---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] Cost of XLogInsert CRC calculations

2005-05-10 Thread Tom Lane
Simon Riggs <[EMAIL PROTECTED]> writes:
> The cause of the performance problem has been attributed to it being a
> 64-bit rather than 32-bit calculation. That is certainly part of it, but
> I have seen evidence that there is an Intel processor stall associated
> with the use of a single byte constant somewhere in the algorithm.

That's awfully vague --- can't you give any more detail?

I have seen XLogInsert eating significant amounts of time (up to 10% of
total CPU time) on non-Intel architectures, so I think that dropping
down to 32 bits is warranted in any case.  But if you are correct then
that might not fix the problem on Intel machines.  We need more info.

regards, tom lane

---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [HACKERS] [PATCHES] Cleaning up unreferenced table files

2005-05-10 Thread Bruce Momjian
Heikki Linnakangas wrote:
> On Tue, 10 May 2005, Bruce Momjian wrote:
> 
> > The current code is nice and localized and doesn't add any burden on our
> > existing code, which is already complicated enough.  I think we either
> > fix checkfiles.c, or we remove it and decide it isn't worth checking for
> > unrefrenced files.
> 
> Let's pull the patch for now.

Removed, and TODO item restored.

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  pgman@candle.pha.pa.us   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073

---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] Views, views, views: Summary of Arguments

2005-05-10 Thread Greg Sabino Mullane

-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1


> interface designers who are designing for 3rd-party multi-database
> products who are not supporting PostgreSQL yet and will be
> unlikely to learn the system tables

There's a scary thought.

So they are willing to learn the new system views, but not the system
tables? The above seems an argument for I_S, or at least an expanded I_S.

So... the reason we don't want to expand (not alter) I_S is that it is a
"standard" that very few RDBMS actually bother to implement, is already
out of date, and is incomplete? Seems we bend the rules in other ways when
needed (e.g. lowercase relation names), we could certainly add additional
tables and columns here, while maintaining the "standard" set for applications
looking for them.

- --
Greg Sabino Mullane [EMAIL PROTECTED]
PGP Key: 0x14964AC8 200505100635
http://biglumber.com/x/web?pk=2529DF6AB8F79407E94445B4BC9B906714964AC8
-BEGIN PGP SIGNATURE-

iD8DBQFCgI6UvJuQZxSWSsgRAsp3AJ9aY8qeVzpKTcq5yXkhmtkJvuFRWACfXPST
TNNRK32VwbaHimNhB9hjWb8=
=Saja
-END PGP SIGNATURE-



---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])


Re: [HACKERS] Table Partitioning, Part 1

2005-05-10 Thread Simon Riggs
On Tue, 2005-05-10 at 15:01 -0400, Tom Lane wrote:
> "Jim C. Nasby" <[EMAIL PROTECTED]> writes:
> > On Tue, May 10, 2005 at 12:16:17AM +0100, Simon Riggs wrote:
> >> On Mon, 2005-05-09 at 18:53 -0400, Tom Lane wrote:
> >>> I disagree.  The code is there, it could use work, and what you are
> >>> basically proposing is to duplicate both the existing work and much
> >>> of the improvement it needs.
> >> 
> >> Minefields need clearing someday, I suppose. 
> >> 
> >> Multiple inheritance isn't something I'll be spending time on though.
> 
> > I'm also not sure that inheritance would support all cases.
> 
> My point seems to have been widely misunderstood ;-)
> 
> I was not suggesting that partitioning must be built on top of
> inheritance, nor vice versa, nor that they need to support exactly
> the same feature sets.  What I am saying is that if you adopt an
> NIH attitude to the existing code, you are going to end up with a
> lot of duplication.  There is a substantial amount of potentially
> common infrastructure, as well as common problems that you might as
> well solve for both cases at once.  (Remember the inventor's paradox:
> the more general problem is often easier to solve.)  In particular,
> the planning problems look essentially the same to me, as does the
> indexing problem.

Agreed - no worries.

I was already seeing that I can't solve all of the Bizgres use cases at
once, so its time for me to pick one and make that work. If I can do at
least one in a way that solves a general case problem, then I'm happy to
do it that way. We may learn something during the journey about the best
approach for the other use cases.

...and at least we would have something to put into 8.1

Best Regards, Simon Riggs


---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [HACKERS] Oracle Style packages on postgres

2005-05-10 Thread Bruce Momjian

OK, so it seems we need:

o  make private objects accessable only to objects
   in the same schema
o  Allow current_schema.objname to access current 
   schema objects
o  session variables
o  nested schemas?

---

Dave Held wrote:
> > -Original Message-
> > From: Bruce Momjian [mailto:[EMAIL PROTECTED]
> > Sent: Tuesday, May 10, 2005 8:43 AM
> > To: Thomas Hallgren
> > Cc: Tom Lane; [EMAIL PROTECTED]; pgsql-hackers@postgresql.org
> > Subject: Re: [HACKERS] Oracle Style packages on postgres
> > 
> > [...]
> > I suppose.  I think we should focus on the use cases for Oracle
> > packages, rather than the specific functionality it provides. 
> > What things do people need PostgreSQL to do that it already
> > doesn't do?
> 
> Is that really the best way to go about things?  Already RDBMSes
> are patchwork quilts of functionality.  Is merely adding another
> patch the most elegant way to evolve the database?  The problem is
> that Oracle et al are trying to be ORDBMSes and aren't exactly sure
> what the best way to go is.  Instead of trying to formulate a 
> rational plan for what an ORDBMS should even look like, they simply
> look at what would work with their existing infrastructure and tack
> on features.  Then Postgres plays the copycat game.  Instead of
> trying to play catch-up with Oracle, why not beat them at their own
> game?
> 
> What packages provide is encapsulation.  Hiding the data from the
> user and forcing him/her to use the public interface (methods).
> That is an important and admirable OO feature.  Some people think
> that using the DB's security model can achieve the same thing.  It
> can't, exactly, but there's an important lesson to be learned from
> the suggestion.  The problem is that OOP is a *programming* paradigm,
> and a database is not a *programming language*.  In a programming
> language, there really is no such thing as "security".  There is 
> only "visibility" and "accessibility".  Private methods in an OOP
> language do not provide *security*; they only limit *accessibility*.
> Like so many other differences between the relational model and the
> OOP model, there is an impedance mismatch here.  However, there is
> also opportunity.
> 
> In an OOPL, you can say: "Users can call this method from here, but
> not from there."  What you *can't* say is: "User X can call this
> method, but User Y cannot."  As you can see, these are orthogonal
> concepts.  You could call the first "accessibility by location" and
> the second "accessibility by authentication".  An ORDBMS should
> support both.  "Private" does not respect your identity, only your
> calling location.  An ACL does not respect your calling scope, only
> your identity.  A system that has both is clearly more flexible than
> one that only has one or the other.
> 
> Now what you need to keep in mind is that each visibility model 
> serves a different purpose.  The purpose of a security model is to 
> limit *who* can see/touch certain data because the data has intrinsic 
> value.  The purpose of an accessibility model is to limit *where* and 
> *how* data can be seen/touched in order to preserve *program 
> invariants*.  So if you have an object (or tuple!) that records the 
> start and stop time of some process, it is probably a logical 
> invariant that the stop time is greater than or equal to the start 
> time.  For this reason, in a PL, you would encapsulate these fields 
> (attributes) and only provide controlled access to update them that 
> checks and preserves the invariant, *no matter who you are*.  You 
> don't want a superuser violating this invariant any more than Sue 
> User.
> 
> Now you might object that constraints allow you to preserve 
> invariants as well, and indeed they do.  But constraints do not
> respect calling scope.  Suppose there is a process that needs to
> update the timestamps in a way that temporarily breaks the invariant
> but restores it afterwards.  The only way to effect this in a
> constraint environment is to drop the constraint, perform the
> operation, and restore it.  However, dropping a constraint is not an
> ideal solution because there may be other unprivileged processes 
> operating on the relation that still need the constraint to be 
> enforced.  There is no way to say: "There is a priviledged class of 
> methods that is allowed to violate this constraint because they are 
> trusted to restore it upon completion."  Note that this is different
> from saying "There is a priviledged class of users that is allowed
> to violate this constraint."  If you try to do something like give
> read-only access to everybody and only write access to one user and
> define that user to be the owner of the methods that update the data,
> you have to follow the convention that that user only operates 
> through the defined interface, and doesn't hack the data directly.
> Th

Re: [HACKERS] Cost of XLogInsert CRC calculations

2005-05-10 Thread Simon Riggs
On Tue, 2005-05-10 at 18:22 -0400, Tom Lane wrote:
> Simon Riggs <[EMAIL PROTECTED]> writes:
> > The cause of the performance problem has been attributed to it being a
> > 64-bit rather than 32-bit calculation. That is certainly part of it, but
> > I have seen evidence that there is an Intel processor stall associated
> > with the use of a single byte constant somewhere in the algorithm.
> 
> That's awfully vague --- can't you give any more detail?
> 
> I have seen XLogInsert eating significant amounts of time (up to 10% of
> total CPU time) on non-Intel architectures, so I think that dropping
> down to 32 bits is warranted in any case.  But if you are correct then
> that might not fix the problem on Intel machines.  We need more info.

I have seen an Intel VTune report that shows a memory stall causing high
latency associated with a single assembly instruction that in the
compiled code of the CRC calculation. The instruction was manipulating a
single byte only. I couldn't tell exactly which line of PostgreSQL code
produced the assembler. This could be either a partial register stall or
a memory order buffer stall (or another?)

Here's a discussion of this
http://www.gamasutra.com/features/19991221/barad_pfv.htm

Sorry, but thats all I know. I will try to obtain the report, which is
not in my possession.

I do *not* know with any certainty what the proportion of time lost from
the CRC calc proper in an idealised CPU against the time lost from this
hardware specific interaction. I don't know if non-Intel CPUs are
effected either.

Best Regards, Simon Riggs



---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])


Re: [HACKERS] LEFT JOIN used in psql describe.c

2005-05-10 Thread Bruce Momjian
Bruce Momjian wrote:
> Tom Lane wrote:
> > "Greg Sabino Mullane" <[EMAIL PROTECTED]> writes:
> > >> Does anyone know why so many LEFT JOINs are used in psql/describe.c to
> > >> join to the pg_namespace table, like here:
> > 
> > > Yes, pg_relnamespace is definitely not null. I've actually already removed
> > > the left joins from my \df patch, since I had to rewrite some of the
> > > queries anyway. If this is wrong, please let me know of course!
> > 
> > I think the idea was to be certain to show every pg_proc entry (or other
> > catalog for other \d commands), no matter how badly broken the catalog
> > interrelationships might be.  If there's not an unarguable reason
> > for eliminating the left joins I'd be inclined to keep it like that.
> > What does an inner join buy here, other than brittleness?
> > 
> > (Yeah, I have the perspective of a developer who deals with broken
> > situations every day.  So?)
> 
> If we have problems with the system catalogs, I don't see how this join
> has a high probability of catching the problem.  If there was some known
> problem of the join not always working, I could see the use of LEFT
> JOIN, but there isn't, so it just seems confusing, and these queries are
> used by others as models of how to do system joins, so could confuse our
> users as well.
> 
> I think the LEFT JOIN should be removed unless there is a known problem,
> and if one shows up, we can re-add them later.

I still think that the LEFT JOINs used in psql system queries is
confusing and perhaps adds performance overhead while adding little
reliability, but no one else seems to think so so I will drop the idea.

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  pgman@candle.pha.pa.us   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073

---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


Re: [HACKERS] Views, views, views: Summary of Arguments

2005-05-10 Thread Michael Glaesemann
On May 11, 2005, at 7:38, Greg Sabino Mullane wrote:
So they are willing to learn the new system views, but not the system
tables? The above seems an argument for I_S, or at least an expanded 
I_S.

So... the reason we don't want to expand (not alter) I_S is that it is 
a
"standard" that very few RDBMS actually bother to implement, is already
out of date, and is incomplete? Seems we bend the rules in other ways 
when
needed (e.g. lowercase relation names), we could certainly add 
additional
tables and columns here, while maintaining the "standard" set for 
applications
looking for them.

One of the reasons I've been impressed with PostgreSQL and its 
developers is that I've seen respect for the SQL specifications *except 
in cases where it would seriously break backwards compatibility*. In 
implementing new features, if the SQL spec has something to say about 
it, it's been my observation that good efforts have been made to 
comply, though sometimes other syntax or PostgreSQL extensions are 
made.

This is not to say the SQL spec is perfect. (At heart I lean toward the 
Date/Darwin relational model, but that's just me :) However, to take 
something that *is* specified by SQL (and if I understand correctly, 
was *implemented in PostgreSQL specifically for SQL compliance*, it 
would be a shame to break that. I think PostgreSQL's spec compliance is 
a nice bragging point as well -- we do the spec, and more :)

Additional views that depend where possible on the INFORMATION_SCHEMA 
could actually be a good thing, as the INFORMATION_SCHEMA follows the 
spec, it'd be less likely to change between versions and make 
maintenance easier. That said, I haven't looked at the work the new 
systems views people have done. I recognize their motivation, as the 
times I've needed to look at the current system tables, it's always 
been with the docs open right beside me, flipping between pages to see 
everything I need to join to get the information I want. I for one am 
happy and grateful that a group of people have taken it upon themselves 
to provide an easier way to view Postgres system information, and think 
that the additional views in some form would make a great addition to 
PostgreSQL.

Michael Glaesemann
grzm myrealbox com
---(end of broadcast)---
TIP 6: Have you searched our list archives?
  http://archives.postgresql.org


Re: [HACKERS] LEFT JOIN used in psql describe.c

2005-05-10 Thread Marc G. Fournier
On Tue, 10 May 2005, Bruce Momjian wrote:
Bruce Momjian wrote:
Tom Lane wrote:
"Greg Sabino Mullane" <[EMAIL PROTECTED]> writes:
Does anyone know why so many LEFT JOINs are used in psql/describe.c to
join to the pg_namespace table, like here:

Yes, pg_relnamespace is definitely not null. I've actually already removed
the left joins from my \df patch, since I had to rewrite some of the
queries anyway. If this is wrong, please let me know of course!
I think the idea was to be certain to show every pg_proc entry (or other
catalog for other \d commands), no matter how badly broken the catalog
interrelationships might be.  If there's not an unarguable reason
for eliminating the left joins I'd be inclined to keep it like that.
What does an inner join buy here, other than brittleness?
(Yeah, I have the perspective of a developer who deals with broken
situations every day.  So?)
If we have problems with the system catalogs, I don't see how this join
has a high probability of catching the problem.  If there was some known
problem of the join not always working, I could see the use of LEFT
JOIN, but there isn't, so it just seems confusing, and these queries are
used by others as models of how to do system joins, so could confuse our
users as well.
I think the LEFT JOIN should be removed unless there is a known problem,
and if one shows up, we can re-add them later.
I still think that the LEFT JOINs used in psql system queries is
confusing and perhaps adds performance overhead while adding little
reliability, but no one else seems to think so so I will drop the idea.
I'm a bit confused here, but I believe Tom (at least how I read it) was 
agreeing with you about pulling the LEFT JOIN out ... "I think the LEFT 
JOIN should be removed unless there is a known problem, and if one shows 
up, we can re-add them later." ... or am I mis-quoting?


Marc G. Fournier   Hub.Org Networking Services (http://www.hub.org)
Email: [EMAIL PROTECTED]   Yahoo!: yscrappy  ICQ: 7615664
---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
 joining column's datatypes do not match


[HACKERS] --enable-thread-safety?

2005-05-10 Thread Palle Girgensohn
Hi!
Need a piece of advice here. I'm wrapping up the ports for FreeBSD, and jus 
wonder if it is perhaps clever to always add --enable-thread-safety to the 
configure args. Is there a big enough penalty for having it off by default, 
or can I just have it on always?

/Palle
---(end of broadcast)---
TIP 8: explain analyze is your friend


Re: [HACKERS] Oracle Style packages on postgres

2005-05-10 Thread David Fetter
On Tue, May 10, 2005 at 06:55:39PM -0400, Bruce Momjian wrote:
> 
> OK, so it seems we need:
> 
>   o  make private objects accessable only to objects
>  in the same schema
>   o  Allow current_schema.objname to access current 
>  schema objects
>   o  session variables
>   o  nested schemas?

Well, some kind of nestable namespace for objects, anyhow.

I'll look over the SQL:2003 draft and see if I can find anything along
that line in there.

Cheers,
D
-- 
David Fetter [EMAIL PROTECTED] http://fetter.org/
phone: +1 510 893 6100   mobile: +1 415 235 3778

Remember to vote!

---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
  joining column's datatypes do not match


Re: [HACKERS] LEFT JOIN used in psql describe.c

2005-05-10 Thread Bruce Momjian
Marc G. Fournier wrote:
> On Tue, 10 May 2005, Bruce Momjian wrote:
> 
> > Bruce Momjian wrote:
> >> Tom Lane wrote:
> >>> "Greg Sabino Mullane" <[EMAIL PROTECTED]> writes:
> > Does anyone know why so many LEFT JOINs are used in psql/describe.c to
> > join to the pg_namespace table, like here:
> >>>
>  Yes, pg_relnamespace is definitely not null. I've actually already 
>  removed
>  the left joins from my \df patch, since I had to rewrite some of the
>  queries anyway. If this is wrong, please let me know of course!
> >>>
> >>> I think the idea was to be certain to show every pg_proc entry (or other
> >>> catalog for other \d commands), no matter how badly broken the catalog
> >>> interrelationships might be.  If there's not an unarguable reason
> >>> for eliminating the left joins I'd be inclined to keep it like that.
> >>> What does an inner join buy here, other than brittleness?
> >>>
> >>> (Yeah, I have the perspective of a developer who deals with broken
> >>> situations every day.  So?)
> >>
> >> If we have problems with the system catalogs, I don't see how this join
> >> has a high probability of catching the problem.  If there was some known
> >> problem of the join not always working, I could see the use of LEFT
> >> JOIN, but there isn't, so it just seems confusing, and these queries are
> >> used by others as models of how to do system joins, so could confuse our
> >> users as well.
> >>
> >> I think the LEFT JOIN should be removed unless there is a known problem,
> >> and if one shows up, we can re-add them later.
> >
> > I still think that the LEFT JOINs used in psql system queries is
> > confusing and perhaps adds performance overhead while adding little
> > reliability, but no one else seems to think so so I will drop the idea.
> 
> I'm a bit confused here, but I believe Tom (at least how I read it) was 
> agreeing with you about pulling the LEFT JOIN out ... "I think the LEFT 
> JOIN should be removed unless there is a known problem, and if one shows 
> up, we can re-add them later." ... or am I mis-quoting?

I am actually quoting myself in the posting, so the words are mine, not
Tom's.  

The basic issue is code simplicity vs. robustness, and I am leaning to
the former because there is no known robustness problem.  It is actually
opposite of our opinions on checking for unreferenced files, where I
want robustness (because it is a known problem) and Tom wants simplicity
(though he is flexible on this), so it seems the two of us switch sides
occasionally.  :-)

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  pgman@candle.pha.pa.us   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073

---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [HACKERS] Oracle Style packages on postgres

2005-05-10 Thread Bruce Momjian
David Fetter wrote:
> On Tue, May 10, 2005 at 06:55:39PM -0400, Bruce Momjian wrote:
> > 
> > OK, so it seems we need:
> > 
> > o  make private objects accessable only to objects
> >in the same schema
> > o  Allow current_schema.objname to access current 
> >schema objects
> > o  session variables
> > o  nested schemas?
> 
> Well, some kind of nestable namespace for objects, anyhow.

How would nested namespaces be different from nested schemas?  I thought
the two were the same.

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  pgman@candle.pha.pa.us   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073

---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])


Re: [HACKERS] --enable-thread-safety?

2005-05-10 Thread Bruce Momjian
Palle Girgensohn wrote:
> Hi!
> 
> Need a piece of advice here. I'm wrapping up the ports for FreeBSD, and jus 
> wonder if it is perhaps clever to always add --enable-thread-safety to the 
> configure args. Is there a big enough penalty for having it off by default, 
> or can I just have it on always?

I don't think there is any real penalty in PostgreSQL for having it on. 
I don't know what the operating system overhead is on FreeBSD.

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  pgman@candle.pha.pa.us   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073

---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [HACKERS] [PATCHES] Cleaning up unreferenced table files

2005-05-10 Thread Tom Lane
Heikki Linnakangas <[EMAIL PROTECTED]> writes:
> On Tue, 10 May 2005, Bruce Momjian wrote:
>> The current code is nice and localized and doesn't add any burden on our
>> existing code, which is already complicated enough.  I think we either
>> fix checkfiles.c, or we remove it and decide it isn't worth checking for
>> unrefrenced files.

> Let's pull the patch for now.

FWIW, I was OK with the idea of adding something similar to the given
patch to find out whether we had a problem or not.  With sufficient
evidence that lost files are a big problem, I'd be in favor of a
mechanism of the kind proposed in Heikki's latest messages.  The
disconnect for me at the moment is that there's no evidence to justify
that amount of effort/risk.  A startup-time patch would have provided
that evidence, or else have proven that it's not worth spending more
time on.

regards, tom lane

---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: [HACKERS] Can we get patents?

2005-05-10 Thread Bruno Wolff III
On Tue, May 10, 2005 at 16:57:01 -0400,
  Dave Cramer <[EMAIL PROTECTED]> wrote:
> 
> 3) His question was why? With a bsd license you can't stop anyone from 
> using it and nobody
> else can patent it since by placing it in the project you are 
> establishing prior art.

Nope. They can still be issued a patent and then you will have to come
up with some big bucks to get it overturned. The Patent Office isn't
going to go looking through the Postgres source when checking for prior
art. Even if you got the method published in a journal, the Patent Office
could still not see that the paper was the same technique as was being
used in the patent.

---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])


Re: [HACKERS] Oracle Style packages on postgres

2005-05-10 Thread David Fetter
On Tue, May 10, 2005 at 09:49:13PM -0400, Bruce Momjian wrote:
> David Fetter wrote:
> > On Tue, May 10, 2005 at 06:55:39PM -0400, Bruce Momjian wrote:
> > > 
> > > OK, so it seems we need:
> > > 
> > >   o  make private objects accessable only to objects in the same
> > >  schema
> > >   o  Allow current_schema.objname to access current 
> > >  schema objects
> > >   o  session variables
> > >   o  nested schemas?
> > 
> > Well, some kind of nestable namespace for objects, anyhow.
> 
> How would nested namespaces be different from nested schemas?  I
> thought the two were the same.

I was thinking of nested namespaces in the more limited sense of
namespaces for bundles of functions/stored procedures rather than a
full-on hierarchy where a table can have a schema which resides inside
another schema which resides...unless people really want to have it
that way.

In a slightly related situation, at least in my mind, it seems like
for full-on ORDBMS functionality, it should be possible to have a
column of type schema or setof record, &c., and be able to take these
things apart at each row.

Cheers,
D
-- 
David Fetter [EMAIL PROTECTED] http://fetter.org/
phone: +1 510 893 6100   mobile: +1 415 235 3778

Remember to vote!

---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [HACKERS] Oracle Style packages on postgres

2005-05-10 Thread Tom Lane
Bruce Momjian  writes:
> OK, so it seems we need:

>   o  make private objects accessable only to objects
>  in the same schema
>   o  Allow current_schema.objname to access current 
>  schema objects
>   o  session variables
>   o  nested schemas?

There's been a lot of handwaving about nested schemas in this thread,
but no one has explained how they could actually *work* given the SQL
syntax rules.  In general, "a" is a column from the current table
set, "a.b" is a column b in table/alias a from the current query,
"a.b.c" is a column c from table b in schema a, "a.b.c.d" is a column
d from table c in schema b in catalog a, and any more than that is
a syntax error.  I do not see how to add nested schemas without creating
unworkable ambiguities, not to say outright violations of the spec.

regards, tom lane

---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])


Re: [HACKERS] --enable-thread-safety?

2005-05-10 Thread Tom Lane
Bruce Momjian  writes:
> Palle Girgensohn wrote:
>> Need a piece of advice here. I'm wrapping up the ports for FreeBSD, and jus 
>> wonder if it is perhaps clever to always add --enable-thread-safety to the 
>> configure args. Is there a big enough penalty for having it off by default, 
>> or can I just have it on always?

> I don't think there is any real penalty in PostgreSQL for having it on. 
> I don't know what the operating system overhead is on FreeBSD.

More to the point: the overhead if any is all at the libc level.
If your libc is such that there isn't any penalty for thread support
(perhaps better stated "you pay the overhead whether you want it or
not") then go for it.  I believe this is the case in recent Linuxen,
but I don't know the state of play in BSDen.

regards, tom lane

---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]